For a faculty project I want to scrape some news web pages. I ran into a problem: when I load a page's HTML into Python, I get the HTML shown under View Source, which is very different from the Elements shown in the browser's Inspect panel. I have tried BeautifulSoup, requests and Selenium and got the same result each time.
Does anyone have an idea how I could scrape the elements of a page as they appear in Inspect, or how to get that HTML into Python so that I can parse it? This is the Selenium code I tried:
from selenium import webdriver

url = 'https://www.24ur.com/novice/korona/v-revozu-znova-zagnali-proizvodnjo.html'
driver = webdriver.Chrome()
driver.get(url)
# Ask the browser for the current DOM, serialized as HTML
htmlx = driver.execute_script("return document.documentElement.outerHTML")
print(htmlx)
driver.quit()  # close the browser window when done
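One thing I have not ruled out is that the article content is added by JavaScript after the page loads, so the DOM may simply be incomplete at the moment I read it. Below is a minimal sketch of waiting for the content, assuming that is the cause. The helper names `wait_for_html` and `fetch_rendered_html`, the marker string, and the timeout values are my own placeholders, not something taken from the site or from Selenium.

```python
import time


def wait_for_html(driver, marker, timeout=10.0, poll=0.5):
    """Re-read driver.page_source until `marker` appears or `timeout` passes.

    Returns the last HTML that was read; the caller should check whether
    the marker is actually in it.
    """
    deadline = time.monotonic() + timeout
    html = driver.page_source
    while marker not in html and time.monotonic() < deadline:
        time.sleep(poll)
        html = driver.page_source
    return html


def fetch_rendered_html(url, marker, timeout=10.0):
    """Open `url` in Chrome and return the HTML once `marker` shows up."""
    # Imported here so that wait_for_html can be used without Selenium.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        return wait_for_html(driver, marker, timeout=timeout)
    finally:
        driver.quit()  # always close the browser, even on errors
```

It would be called as `fetch_rendered_html(url, "<article")`, where `"<article"` is only a guess at a string that appears once the article has been rendered; any text visible in the Inspect panel but missing from View Source should work as a marker. The returned string could then be passed to BeautifulSoup for parsing.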
Thank you.