Thursday, 21 January 2016

XPath only extracting some of the data from the site

I am using XPath and Python (lxml and requests) to get data from the site referenced in the code below. I have managed to download most of the data, after a fashion, but I can't extract the Greyhound field, and the DogDetail values also come out rather odd. The Greyhound data is actually an <a> tag with an href path, and after trying several variations of the XPath I still can't get the data out. The overall plan is to download the day's dog racing results into a database (or spreadsheet). Any help appreciated.

 from lxml import html
 import requests

 # Download the results page and parse it into an element tree
 page = requests.get('http://ift.tt/1ZEFlxA')
 tree = html.fromstring(page.content)

 # Race header fields
 track = tree.xpath('//div[@class="track"]/text()')
 print('Track', track)

 date = tree.xpath('//div[@class="date"]/text()')
 print('date', date)

 datetime = tree.xpath('//div[@class="datetime"]/text()')
 print('datetime', datetime)

 # Greyhound: this is the query that returns nothing (the problem described above)
 essentialgreyhound = tree.xpath('//a[@href="essential greyhound"]/text()')
 print('Greyhound', essentialgreyhound)

 # Per-runner fields
 firstessentialfin = tree.xpath('//li[@class="first essential fin"]//text()')
 print('Position:', firstessentialfin)
 sp = tree.xpath('//li[@class="sp"]/text()')
 print('StartingPrice:', sp)
 trap = tree.xpath('//li[@class="trap"]/text()')
 print('Trap:', trap)
 trainer = tree.xpath('//li[@class="essential trainer"]/text()')
 print('Trainer:', trainer)
 timeSec = tree.xpath('//li[@class="timeSec"]/text()')
 print('TimeSec', timeSec)
 timeDistance = tree.xpath('//li[@class="timeDistance"]/text()')
 print('TimeDistance', timeDistance)

 firstessentialcomment = tree.xpath('//li[@class="first essential comment"]/text()')
 print('Comment', firstessentialcomment)
 # DogDetail: the output of this query comes out odd (see above)
 firstessential = tree.xpath('//li[@class="first essential"]/text()')
 print('DogDetail', firstessential)
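
One possible cause, sketched under assumptions: the predicate @href="essential greyhound" compares the link's URL to that string, whereas "essential greyhound" looks like a class name on the enclosing element. The page's real markup is not shown here, so the SAMPLE snippet below is hypothetical (including the "line" class on each row), as are the helper names has_class, first_text, parse_results and to_csv. It matches on a single class token instead of the whole class attribute, reads the link text and href from the <a> inside the matched <li>, and groups the fields per runner so they can be written out as CSV.

```python
import csv
import io

from lxml import html

# Hypothetical markup: an assumption made for illustration only, since the
# real page structure is not shown. It assumes "essential greyhound" is a
# class on an <li> that wraps the link, and that each runner is one <ul>.
SAMPLE = """
<html><body>
<ul class="line">
  <li class="first essential fin">1st</li>
  <li class="essential greyhound"><a href="/dog/123">Example Dog</a></li>
  <li class="trap">4</li>
  <li class="sp">5/2</li>
</ul>
<ul class="line">
  <li class="essential fin">2nd</li>
  <li class="essential greyhound"><a href="/dog/456">Another Dog</a></li>
  <li class="trap">1</li>
  <li class="sp">7/4</li>
</ul>
</body></html>
"""

FIELDS = ['position', 'greyhound', 'link', 'trap', 'sp']


def has_class(name):
    """Build an XPath predicate that matches one whitespace-separated class token."""
    return 'contains(concat(" ", normalize-space(@class), " "), " %s ")' % name


def first_text(node, xpath):
    """Return the first non-empty, stripped string result of a relative XPath, or None."""
    values = [v.strip() for v in node.xpath(xpath) if v.strip()]
    return values[0] if values else None


def parse_results(markup):
    """Extract one dict per runner row from the (assumed) markup."""
    tree = html.fromstring(markup)
    rows = []
    for line in tree.xpath('//ul[%s]' % has_class('line')):
        greyhound_li = './/li[%s]' % has_class('greyhound')
        rows.append({
            'position': first_text(line, './/li[%s]//text()' % has_class('fin')),
            'greyhound': first_text(line, greyhound_li + '/a/text()'),
            'link': first_text(line, greyhound_li + '/a/@href'),
            'trap': first_text(line, './/li[%s]/text()' % has_class('trap')),
            'sp': first_text(line, './/li[%s]/text()' % has_class('sp')),
        })
    return rows


def to_csv(rows):
    """Serialise the row dicts to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == '__main__':
    print(to_csv(parse_results(SAMPLE)))
```

Matching on a class token rather than the full attribute value also means rows with and without the extra "first" class are picked up by the same query, which may explain why queries such as @class="first essential" return only part of the data. Whether that applies to the real page depends on its actual markup.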



