Is there a standard section field to look for while scraping news articles?
Explanation: I would like to extract the title of a news article and its associated category or section.
Example:
Article:
http://ift.tt/2xzYLAi
Title of the article:
Environmentalists: UK's Antarctic islands need protection
Section or Category:
Science and Environment
There are various categories such as politics, lifestyle, tech, sports, etc. I checked the BBC and The Guardian. They use different fields to specify these sections.
I expect that it might be different for various news websites. However, could it be that these different fields are already known so I can look for them while scraping?
Ideally, is there already a library that provides this kind of category extraction (in Python)? I am about to write one myself, so if one already exists I do not want to reinvent the wheel.
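To make the question concrete, here is a rough sketch of what I have in mind, using only the standard library. It assumes the page exposes its metadata in `<meta>` tags, for example the Open Graph `og:title` and `article:section` properties, which are one existing convention but not something every site follows. The other keys in `TITLE_KEYS` and `SECTION_KEYS` are my own guesses, and the names `MetaCollector` and `extract_title_and_section` are just placeholders. It does not look at JSON-LD blocks (for example schema.org `articleSection`), which some sites may use instead.

```python
from html.parser import HTMLParser

# Candidate metadata keys, tried in order. These lists are assumptions,
# not an exhaustive or authoritative standard.
TITLE_KEYS = ("og:title", "twitter:title")
SECTION_KEYS = ("article:section", "section", "category")


class MetaCollector(HTMLParser):
    """Collects <meta property=... content=...> and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        key = attrs.get("property") or attrs.get("name")
        content = attrs.get("content")
        if key and content:
            # Keep the first value seen for each key.
            self.meta.setdefault(key.lower(), content.strip())


def extract_title_and_section(html):
    """Return (title, section); either may be None if no known key is present."""
    parser = MetaCollector()
    parser.feed(html)
    title = next((parser.meta[k] for k in TITLE_KEYS if k in parser.meta), None)
    section = next((parser.meta[k] for k in SECTION_KEYS if k in parser.meta), None)
    return title, section
```

In practice the HTML string would come from whatever download step the scraper already has. Sites that use none of these keys would return `None` for the section, which is exactly the gap I am hoping an existing library covers.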