Thursday, 3 March 2016

Help! Web crawling, Scrapy, Beautiful Soup

I've looked for ways to solve the task I need to perform, but I have yet to find a solution. If anyone can help, it would be appreciated. I don't think the problem is too elaborate, but my skills are limited.

The Task: I have a list of 300+ companies in a spreadsheet. For each one, I need to search the web and return any results or links in which both the company name and a second word, "Manchester", are mentioned.

I know I need to use a web crawler or other code to do this, but most of the examples I find crawl a single company website. I only have the company names, and I don't want to manually find the URL of each company's website (though this may be another task I need to perform!). I also don't want to limit the search to the company websites; I want to find instances where the company name and "Manchester" appear together elsewhere, such as in news articles.

Basically, I need a piece of code that searches Google as follows: 1 - "Company A Manchester", 2 - "Company B Manchester", 3 - "Company C Manchester", and so on.
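To make the idea concrete, here is a minimal sketch of the first step only: turning the list of company names into one search query (and one Google search URL) per company. It assumes the spreadsheet has been exported to CSV with the company name in the first column; the function names and the sample data are made up for illustration. It does not fetch or parse any results. It only produces links that can be opened in a browser.

```python
import csv
import io
from urllib.parse import urlencode

KEYWORD = "Manchester"  # the second word every result should mention


def load_companies(csv_text):
    """Return the non-empty company names from the first column of CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    return [row[0].strip() for row in reader if row and row[0].strip()]


def build_query(company, keyword=KEYWORD):
    """Quote both terms so the search engine treats each as an exact phrase."""
    return f'"{company}" "{keyword}"'


def build_search_url(company, keyword=KEYWORD):
    """Build a Google search URL for one company (URL-encoding the query)."""
    return "https://www.google.com/search?" + urlencode({"q": build_query(company, keyword)})


if __name__ == "__main__":
    # Hypothetical sample standing in for the exported spreadsheet.
    sample = "Company A\nCompany B\nCompany C\n"
    for name in load_companies(sample):
        print(name, build_search_url(name))
```

For a real spreadsheet export, the file contents could be read with `open(path, newline="").read()` and passed to `load_companies`. Collecting the result links automatically would need a further step, such as a search API, which is not covered here.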

I'd like to return the links so I can then visit each page where the company and the word "Manchester" are mentioned.

Any help or ideas appreciated. Thanks in advance!



