Are there any scraping technologies out there (e.g. Scrapy) that check a site's T&Cs before scraping?
I’d like to filter my scraping so that I don’t scrape sites that prohibit “automation/scraping/bots” in their T&Cs, perhaps by searching the T&Cs for keywords and moving on if any are detected.
This would be in addition to following a site's robots.txt.
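To make the keyword idea concrete, here is a rough sketch of the kind of filter I have in mind. It assumes the T&Cs page has already been fetched and reduced to plain text; the pattern list and the function names (`find_prohibition_keywords`, `should_skip_site`) are my own placeholders, not part of Scrapy or any other library.

```python
import re

# Hypothetical keyword patterns; this list is an assumption and would need
# tuning against real T&Cs pages.
PROHIBITION_PATTERNS = [
    r"\bscrap(e|ing|er)s?\b",
    r"\bcrawl(ing|er)?s?\b",
    r"\b(ro)?bots?\b",
    r"\bspiders?\b",
    r"\bautomated (means|access|queries|tools?)\b",
    r"\bdata mining\b",
]
_PATTERN = re.compile("|".join(PROHIBITION_PATTERNS), re.IGNORECASE)


def find_prohibition_keywords(terms_text):
    """Return the sorted, lower-cased set of keyword hits in the T&Cs text."""
    return sorted({m.group(0).lower() for m in _PATTERN.finditer(terms_text)})


def should_skip_site(terms_text):
    """True if any prohibition keyword appears, i.e. move on to the next site."""
    return bool(find_prohibition_keywords(terms_text))


if __name__ == "__main__":
    sample = "You may not use bots or scraping tools to access this site."
    print(find_prohibition_keywords(sample))
    print(should_skip_site(sample))
```

A plain keyword match like this would also fire on harmless sentences (for example, T&Cs that mention the site's own bots), so I would probably treat a hit as a reason to skip or manually review the site rather than as proof that scraping is prohibited.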