I'm completely new to web scraping, so any reference sites would be great. I'm slightly confused about how to get at the actual data. When I print(theText), I get a bunch of HTML (which is what I expect). How exactly do I go about extracting values from it? Do I have to use regular expressions to pull out the actual numerical data?
import urllib.request

def getData():
    # Fetch the page and decode the raw bytes into a string of HTML
    request = urllib.request.Request("http://ift.tt/1HliMuc")
    response = urllib.request.urlopen(request)
    the_page = response.read()
    theText = the_page.decode()
    print(theText)
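For what it's worth, regular expressions are usually not needed for this: an HTML parser can walk the markup and hand back the text inside specific tags. Here is a minimal sketch using only the standard library's html.parser. It assumes the values sit in table cells (<td>), which may not match the real page. The CellCollector class, the extract_cells helper, and the sample HTML are all made up for illustration, and the sample string stands in for theText.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Hypothetical helper: collects the text inside every <td> element."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        # Start a new, empty entry whenever a table cell opens
        if tag == "td":
            self.in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        # Text may arrive in several pieces, so append to the current cell
        if self.in_cell:
            self.cells[-1] += data


def extract_cells(html_text):
    """Return the stripped text of each <td> found in html_text."""
    parser = CellCollector()
    parser.feed(html_text)
    parser.close()
    return [cell.strip() for cell in parser.cells]


# Made-up sample standing in for theText (assumption: data lives in <td> tags)
sample = (
    "<table>"
    "<tr><td>Open</td><td>12.5</td></tr>"
    "<tr><td>Close</td><td>13.1</td></tr>"
    "</table>"
)

cells = extract_cells(sample)

# Keep only the cells that look like plain non-negative decimal numbers
numbers = [float(c) for c in cells if c.replace(".", "", 1).isdigit()]

print(cells)
print(numbers)
```

In practice, a third-party library such as Beautiful Soup makes this kind of lookup shorter, but the idea is the same: parse the HTML into a structure first, then select the elements that hold the values.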