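import urllib2

# pattern (a compiled regex) and text_file (an open file) are assumed to be defined earlier in the script
for numb in range(50000, 100000):
    address = ('http://ift.tt/1Pum8KK') % numb
    html = urllib2.urlopen(address).read()   # download the page
    regex = pattern.findall(html)            # extract all matches
    clean = "\n".join(regex)
    text_file.write(clean)
    print numb                               # progress indicator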
The script runs fine when scraping range(1, 1000) but becomes very slow when scraping above 10000; for example, the script above tries to scrape from 50000 to 100000. What could be causing this? Note that the page loads in my browser in less than 1 ms, so the connection itself should not be the problem.
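One way to narrow this down might be to time the download step and the regex step separately for each page, to see which one grows at higher numbers. The following is only a minimal sketch: `timed_call` and `profile_steps` are hypothetical helper names that are not part of the original script, and `fetch` and `parse` are stand-ins for whatever the script uses (here, the `urllib2.urlopen(...).read()` call and `pattern.findall`).

```python
import time

def timed_call(func, *args):
    """Call func(*args) and return (result, elapsed_seconds)."""
    start = time.time()
    result = func(*args)
    return result, time.time() - start

def profile_steps(numbers, fetch, parse):
    """Yield (numb, fetch_seconds, parse_seconds) for each page number.

    fetch(numb) is assumed to return the page HTML as a string;
    parse(html) is assumed to run the regex over that string.
    """
    for numb in numbers:
        html, fetch_s = timed_call(fetch, numb)
        _, parse_s = timed_call(parse, html)
        yield numb, fetch_s, parse_s
```

Running this over a small sample from each range (for instance a few numbers near 1000 and a few near 50000) and comparing the two timing columns should indicate whether the time goes into the network request or into the pattern matching.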