Monday, 21 November 2016

Need help with web scraping in Python where JavaScript lazy loading is used

I am using BeautifulSoup to parse the HTML after scrolling the window with the Selenium driver. page_source returns the HTML of the whole page, but when I call find_all on the entire page_source I do not get all the data; I only get the first 15 records.

import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

# Start Chrome using a local chromedriver binary and open the page
driver = webdriver.Chrome("E:\\Softwares\\drivers\\chromedriver.exe")
driver.get("https://www.example.com")
driver.maximize_window()

# Send PAGE_DOWN to the body element to trigger the lazy loading
htmlElem = driver.find_element_by_tag_name('body')
no_of_pagedowns = 20

while no_of_pagedowns:
    htmlElem.send_keys(Keys.PAGE_DOWN)
    time.sleep(0.2)
    no_of_pagedowns -= 1

# Wait for pending content, then parse the rendered HTML
time.sleep(10)
soup = BeautifulSoup(driver.page_source, "html.parser")
div = soup.find_all('div', attrs={'class': 'description padding-tb_9px-rl_12px'})
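
One possible direction, not verified against this site: if the page only appends records when the scroll position reaches the bottom, a fixed count of 20 PAGE_DOWN presses with a 0.2 second pause may stop before everything has loaded. The sketch below scrolls to the bottom repeatedly until document.body.scrollHeight stops growing. The function name scroll_until_stable and the parameters pause and max_rounds are hypothetical and chosen for illustration. It also assumes the page scrolls the window itself rather than an inner container.

```python
import time


def scroll_until_stable(driver, pause=1.0, max_rounds=50):
    """Scroll to the bottom until the page height stops growing.

    `driver` is any object with Selenium's `execute_script` method and
    `page_source` attribute. Returns the final page source.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_rounds):
        # Jump to the bottom so the lazy loader is asked for more records
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # assumed to be long enough for new records to render
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # height unchanged, so nothing new was appended
        last_height = new_height
    return driver.page_source
```

The returned string can be passed to BeautifulSoup(html, "html.parser") in place of driver.page_source in the code above. The max_rounds cap is there so that a page with endless scrolling does not loop forever.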



