I am currently learning how to use XPath to extract information from an HTML document. I am using Python and have had no trouble getting values such as the title of a webpage, but when I try to get the text of a particular cell in a table, I get an empty result.
Here is my code. I used Chrome to copy the XPath of the table cell whose value I want:
from lxml import html
import requests
page = requests.get('https://en.wikipedia.org/wiki/List_of_Olympic_Games_host_cities')
tree = html.fromstring(page.content)
# This should get the cell text:
location = tree.xpath('//*[@id="mw-content-text"]/div/table[1]/tbody/tr[1]/td[3]/text()')
print('Location: ', location)
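One possible explanation, offered as an assumption rather than a confirmed diagnosis of this page: browsers insert a tbody element into the DOM even when the raw HTML has none, so an XPath copied from Chrome can contain a /tbody step that does not exist in the source that lxml parses. The sketch below uses a small made-up HTML snippet (the SAMPLE markup and the "content" id are hypothetical, not taken from the Wikipedia page) to compare the two forms of the path:

```python
from lxml import html

# Hypothetical markup: a table written without an explicit <tbody>,
# which is how many pages are served before the browser builds its DOM.
SAMPLE = """
<html><body>
<div id="content">
<table>
<tr><th>City</th><th>Country</th></tr>
<tr><td>Athens</td><td>Greece</td></tr>
</table>
</div>
</body></html>
"""

tree = html.fromstring(SAMPLE)

# Path in the style a browser would copy, including the tbody step.
with_tbody = tree.xpath('//*[@id="content"]/table[1]/tbody/tr[2]/td[1]/text()')

# Same path with the tbody step removed.
without_tbody = tree.xpath('//*[@id="content"]/table[1]/tr[2]/td[1]/text()')

print('With tbody:   ', with_tbody)
print('Without tbody:', without_tbody)
```

If the same thing is happening with the real page, dropping /tbody from the copied path would be the first thing to try. The row and column indexes may also need adjusting, since the served HTML can differ from what the browser shows after scripts run.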