Friday, 20 February 2015

Web scraping / crawling to find image hyperlinks

I need to crawl a website (I have both FTP and HTTP access, with login and password) to collect all of its image hyperlinks. I have an Excel sheet with many columns, including a unique product SKU column (the primary key); each SKU is also a substring of the matching JPG filenames. I need a crawler that finds every matching JPG (one or more) across the multiple folders for each product SKU / row, and then writes those HTTP links into new columns. I'm thinking Python, but I'm not very good at that language. Any ideas? Thanks, guys!
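To make the matching step concrete, here is a rough sketch of one possible starting point, using only the Python standard library. It pulls links out of an already-downloaded HTML page and groups the JPG links by SKU. The names `LinkCollector`, `extract_links`, and `match_images`, the `example.com` URLs, and the sample SKUs are all made up for illustration. Logging in, walking the folders, and reading or writing the Excel sheet are not covered here.

```python
from html.parser import HTMLParser
from posixpath import basename
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute URLs from every <a href> and <img src> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only <a href="..."> and <img src="..."> are of interest here.
        wanted = {"a": "href", "img": "src"}.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all link targets found in one HTML page, as absolute URLs."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def match_images(skus, urls):
    """Map each SKU to the JPG URLs whose filename contains that SKU.

    Assumption: matching is a case-insensitive substring test on the
    filename only, so a short SKU can also match a longer, similar one.
    """
    matches = {sku: [] for sku in skus}
    for url in urls:
        filename = basename(urlparse(url).path).lower()
        if not filename.endswith((".jpg", ".jpeg")):
            continue
        for sku in skus:
            if sku.lower() in filename:
                matches[sku].append(url)
    return matches


# Hypothetical sample data standing in for a real page and real SKUs.
sample_page = (
    '<a href="shoes/AB123_front.jpg">front</a>'
    '<a href="shoes/AB123_back.JPG">back</a>'
    '<a href="readme.txt">notes</a>'
)
sample_links = extract_links(sample_page, "http://example.com/images/")
sample_matches = match_images(["AB123", "ZZ999"], sample_links)
```

Each list in `sample_matches` could then be spread across the new columns of that SKU's row. Because the match is a plain substring test, a SKU such as `AB12` would also pick up files for `AB123`, so a stricter rule may be needed depending on how the filenames are built.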




