Friday, March 26, 2021

Python urllib: how to load all web browser content

I am web scraping a page that has a "Load more" button at the bottom. When I scrape for the content I am looking for, I only get the data up to the "Load more" button.

How can I get all of the data using Python's urllib library?

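from urllib.request import urlopen

# urlopen needs a full URL including the scheme (http:// or https://)
homeUrl = "https://www.somewebpage.com"
homePage = urlopen(homeUrl)
# read() returns raw bytes, so decode them into text
home_html_bytes = homePage.read()
home_html = home_html_bytes.decode("utf-8")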

I've found something similar to my problem, and it looks like I might have to use a package called Selenium, but I'm not sure.
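As far as I understand, urlopen only returns the HTML the server sends initially and does not run the JavaScript behind the button. If the button simply requests further pages from the server, I imagine urllib alone could fetch them in a loop, roughly like the sketch below. The "?page=N" query parameter and the names fetch_page and load_all_pages are my own assumptions for illustration; the real request would have to be looked up in the browser's network tab, and if the data only exists after JavaScript runs, a browser driver such as Selenium is probably needed instead.

```python
from urllib.request import urlopen


def fetch_page(url):
    """Download one URL and return its body decoded as UTF-8 text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def load_all_pages(base_url, fetch=fetch_page, max_pages=50):
    """Collect page bodies until an empty page or max_pages is reached.

    Assumes a hypothetical "?page=N" query parameter; the actual request
    sent by the "Load more" button may look quite different.
    """
    pages = []
    for page_number in range(1, max_pages + 1):
        body = fetch(f"{base_url}?page={page_number}")
        if not body.strip():
            break  # an empty response is treated as "no more data"
        pages.append(body)
    return pages
```

Calling load_all_pages("https://www.somewebpage.com") would then return a list with one HTML string per page. The fetch argument is there only so the loop can be tried out with a stand-in function before hitting the real site.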



