Thursday, May 2, 2019

How can I make a Python web crawler ignore calendar traps without hard-coding the keyword "calendar"?

I am writing a web crawler, and I have an isValid function that determines whether a URL is valid. I want the crawler to ignore calendar traps without hard-coding the word "calendar". My idea is to parse the query string and check whether the day keeps changing while the month and year stay the same, which would indicate a trap. I am not sure how to implement this. Can anyone help me with this in Python?
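One possible way to express that idea, as a rough sketch rather than a definitive solution: group URLs by host, path, and every query parameter except one, then count how many distinct values the remaining parameter takes. If a single parameter (for example a day) keeps changing while everything else (for example month and year) stays fixed, the count grows past a threshold and the URL is treated as a trap. No parameter names are hard-coded. The class name `TrapDetector`, the method `is_trap`, the threshold `max_variants`, and the example URLs are all assumptions made for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse, parse_qsl


class TrapDetector:
    """Hypothetical helper: flags URLs where one query parameter keeps
    changing while all other parameters stay the same."""

    def __init__(self, max_variants=10):
        # Assumed threshold: how many distinct values one parameter may
        # take (with everything else fixed) before it counts as a trap.
        self.max_variants = max_variants
        # (host, path, varying name, other params) -> values seen so far
        self._seen = defaultdict(set)

    def is_trap(self, url):
        parts = urlparse(url)
        params = parse_qsl(parts.query, keep_blank_values=True)
        trapped = False
        for i, (name, value) in enumerate(params):
            # Everything except the parameter currently under test.
            others = frozenset(params[:i] + params[i + 1:])
            key = (parts.netloc.lower(), parts.path, name, others)
            self._seen[key].add(value)
            if len(self._seen[key]) > self.max_variants:
                trapped = True
        return trapped


detector = TrapDetector(max_variants=5)
for day in range(1, 9):
    url = f"http://example.com/events?year=2019&month=5&day={day}"
    print(day, detector.is_trap(url))
```

With a threshold of 5, the first five days are accepted and the later ones are flagged, because only `day` varies while `year` and `month` stay fixed. An isValid function could call `detector.is_trap(url)` and reject the URL when it returns True. Note that this heuristic also flags any other endlessly varying parameter, such as long pagination, and it does not cover calendars that encode the date in the path instead of the query string, so the threshold would need tuning for the site being crawled.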



