I am implementing the following code in a Scrapy spider to scrape shoes from an e-commerce website.
import scrapy


class HugobossSpider(scrapy.Spider):
    name = 'hugoboss'
    # allowed_domains takes bare domain names, not URLs with a path
    allowed_domains = ['hugoboss.com']
    start_urls = ['http://hugoboss.com/de/boss-herren-neuheiten-schuhe//']

    def parse(self, response):
        # Extract the content using XPath and CSS selectors
        url = response.xpath('//div/@data-mouseoverimage').extract()
        product_title = response.xpath('//*[@class="product-tile__productInfoWrapper product-tile__productInfoWrapper--is-small font__subline"]/text()').extract()
        price = response.css('.product-tile__offer .price-sales::text').getall()
        # Combine the extracted lists row-wise
        for item in zip(url, product_title, price):
            # Create a dictionary to store the scraped info
            scraped_info = {
                'url': item[0],
                'product_title': item[1],
                'price': item[2],
            }
            # Yield the item so Scrapy can export it
            yield scraped_info
The shell returns the output as expected, like this:
But the output CSV file looks disorganized, like this:
I don't understand where the problem occurs. Can anyone help? Thanks.
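One possible cause, offered only as an assumption since the CSV itself is not shown here, is that the `text()` and `::text` selectors return raw text nodes with leading and trailing newlines and spaces. Those newlines end up inside the CSV cells, so one record spreads over several lines in the file. The sketch below uses only the standard library and hypothetical sample values (the URLs, titles, and prices are made up, and `clean` and `to_csv` are illustrative helpers, not part of Scrapy) to show the effect of stripping the values before they are zipped.

```python
import csv
import io


def clean(values):
    """Strip surrounding whitespace and drop values that are empty afterwards."""
    stripped = (v.strip() for v in values)
    return [v for v in stripped if v]


def to_csv(rows):
    """Write rows to an in-memory CSV and return it as a string."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["url", "product_title", "price"])
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical raw values, shaped like what .extract() / .getall() may return
urls = ["https://example.com/a.jpg", "https://example.com/b.jpg"]
titles = ["\n        Sneaker A\n    ", "\n", "\n        Boot B\n    "]
prices = ["\n  199 EUR\n", "\n  249 EUR\n"]

# Without cleaning: newlines inside cells, and the whitespace-only title
# shifts the second product's title out of its row
raw_csv = to_csv(zip(urls, titles, prices))

# With cleaning: one line per product
clean_csv = to_csv(zip(urls, clean(titles), clean(prices)))
print(clean_csv)
```

In the spider, the same idea would mean applying something like `clean` to `product_title` and `price` before the `zip` call. Selecting each product tile first and extracting the fields relative to it would additionally keep the three values aligned when one of them is missing.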