I'm scraping sites like StockX and GOAT for shoe information, but the HTML in the soup I create doesn't include the data I need, which appears to live in the root div under the body of the page. When I inspect the page in the browser, the root div is full of the information I'm trying to scrape, but viewing the page source shows an empty root div. Here is my code:
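    import urllib.request
    from bs4 import BeautifulSoup

    def scrape(url):
        # Send a desktop Firefox User-Agent instead of urllib's default one.
        browser = "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:54.0) Gecko/20100101 Firefox/54.0"
        header = {"User-Agent": browser}
        req = urllib.request.Request(url, headers=header)
        html = urllib.request.urlopen(req).read()
        soup = BeautifulSoup(html, "html.parser")
        # Print every <div id="root"> found in the raw server response.
        print(soup.find_all("div", {"id": "root"}))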
This yields [<div id="root"></div>] for both the StockX and GOAT search result pages.
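The symptom is consistent with the root div being filled in by JavaScript after the page loads: the browser inspector shows the DOM after scripts have run, while "view source" and urllib both see only the raw server response. I have not confirmed that this is what StockX or GOAT do. As a rough sketch of the difference, the snippet below uses only the standard library and two made-up HTML strings (STATIC_HTML, RENDERED_HTML, RootDivProbe and root_is_empty are all hypothetical names for illustration) to check whether a root div actually contains anything:

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the raw response, before any JavaScript runs.
STATIC_HTML = """<html><body>
<div id="root"></div>
<script src="/static/js/main.js"></script>
</body></html>"""

# Hypothetical stand-in for the DOM the browser inspector shows afterwards.
RENDERED_HTML = '<div id="root"><div class="tile">Jordan 1</div></div>'


class RootDivProbe(HTMLParser):
    """Counts child tags and text characters found inside <div id="root">."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # <div> nesting depth inside the root div; 0 means outside it
        self.child_tags = 0   # tags seen inside the root div
        self.text_chars = 0   # non-whitespace-trimmed text length inside the root div

    def handle_starttag(self, tag, attrs):
        if self.depth > 0:
            self.child_tags += 1
            if tag == "div":
                self.depth += 1
        elif tag == "div" and dict(attrs).get("id") == "root":
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth > 0 and tag == "div":
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.text_chars += len(data.strip())


def root_is_empty(html):
    """Return True if <div id="root"> has no child tags and no text."""
    probe = RootDivProbe()
    probe.feed(html)
    return probe.child_tags == 0 and probe.text_chars == 0


print(root_is_empty(STATIC_HTML))    # prints True
print(root_is_empty(RENDERED_HTML))  # prints False
```

If the real response behaves like STATIC_HTML, no HTML parser will find the data there, since parsers such as BeautifulSoup do not execute scripts.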
If anyone can tell me how to extract the information I need, I would greatly appreciate it. Thanks!