Saturday, April 13, 2019

Price Slasher-Price Collector

Problem Statement:

PriceSlasher is a discount and coupon search platform that gives customers a single aggregate view of the offers from brands and firms with an online presence. Building a database of all the offers available on any given day requires a dynamic solution. It is not possible to obtain API access from any of the target e-commerce players, so the firm has to collect data from each player's website, knit it into a single dataset, and run classification operations to separate the offers by credit card, type of offer, related product, and so on. The company wants to start the scraping work with the website TheFleet.com.
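As a rough illustration of the classification step, here is a minimal sketch that tags a scraped offer with a credit card and an offer type using keyword rules. The card names, the offer-type categories, and the `Offer` and `classify` names are all assumptions made for this example; the problem statement does not define a taxonomy, and a real dataset would likely need richer rules.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword rules, assumed for illustration only.
CARD_PATTERN = re.compile(r"\b(visa|mastercard|amex)\b", re.IGNORECASE)
OFFER_TYPE_RULES = [
    ("cashback", re.compile(r"cash\s*back", re.IGNORECASE)),
    ("percent_off", re.compile(r"\d+\s*%\s*off", re.IGNORECASE)),
    ("free_shipping", re.compile(r"free\s+(shipping|delivery)", re.IGNORECASE)),
]


@dataclass
class Offer:
    source: str                 # website the offer was scraped from
    text: str                   # raw offer text
    card: Optional[str] = None  # credit card named in the offer, if any
    offer_type: str = "other"   # first matching rule, else "other"


def classify(source: str, text: str) -> Offer:
    """Tag one scraped offer with a card and an offer type."""
    card_match = CARD_PATTERN.search(text)
    card = card_match.group(1).lower() if card_match else None
    offer_type = "other"
    for name, pattern in OFFER_TYPE_RULES:
        if pattern.search(text):
            offer_type = name
            break
    return Offer(source, text, card, offer_type)


if __name__ == "__main__":
    print(classify("TheFleet.com", "10% off with Visa cards"))
```

Offers from every site would pass through the same function, so the combined dataset shares one set of labels regardless of where each offer came from.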

Jack, an employee at PriceSlasher, is responsible for collecting the data from the websites and updating it regularly. However, the project hit a hurdle when the target website started blocking high-frequency requests from a single IP address to keep external parties from scraping its data. Jack now has the onus of working around this and creating a dynamic tool that gathers data from multiple websites without interruption from the target sites.
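One possible starting point, offered as a sketch and not as the intended answer, is to keep the request rate to each site low enough that it no longer looks like high-frequency traffic. The `HostThrottle` class below is a hypothetical helper that enforces a minimum interval between requests to the same host; the interval value and the URLs are assumptions, and the actual fetching is left out.

```python
import time
from urllib.parse import urlparse


class HostThrottle:
    """Enforce a minimum delay between requests to the same host."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between hits on one host
        self._clock = clock               # injectable for testing
        self._sleep = sleep               # injectable for testing
        self._last_request = {}           # host -> time of last request

    def wait(self, url):
        """Sleep if needed before requesting url; return the delay applied."""
        host = urlparse(url).netloc
        now = self._clock()
        last = self._last_request.get(host)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self.min_interval - (now - last))
        if delay > 0:
            self._sleep(delay)
        # Record when this request is actually allowed to go out.
        self._last_request[host] = now + delay
        return delay


if __name__ == "__main__":
    throttle = HostThrottle(min_interval=2.0)  # assumed value
    for url in ["https://TheFleet.com/offers?page=1",
                "https://TheFleet.com/offers?page=2"]:
        waited = throttle.wait(url)
        print(f"waited {waited:.1f}s before {url}")
```

Because the delay is tracked per host, a tool that gathers data from several websites can interleave requests to different sites without waiting, while still spacing out requests to any single one.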

Please work out a solution for PriceSlasher. Imagine you are Jack.



