Friday, 28 May 2021

Problem with Instagram web scraping using Selenium and Python

I have a problem scraping all the pictures from an Instagram profile. I scroll the page to the bottom and then find all "a" tags, but I always end up with only the last 30 links to pictures. I think the driver doesn't see the full content of the page.

import time
from selenium.webdriver.common.by import By

# Scroll: jump to the bottom repeatedly until the page height stops growing
SCROLL_JS = "window.scrollTo(0, document.body.scrollHeight); return document.body.scrollHeight;"
scrolldown = driver.execute_script(SCROLL_JS)
while True:
    last_count = scrolldown
    time.sleep(2)  # give the next batch of posts time to load
    scrolldown = driver.execute_script(SCROLL_JS)
    if last_count == scrolldown:
        break

# Posts: keep every link whose href points to a post
posts = []
time.sleep(2)
links = driver.find_elements(By.TAG_NAME, 'a')
for link in links:
    post = link.get_attribute('href')
    if post and '/p/' in post:  # href is None for anchors without one
        posts.append(post)
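
A likely explanation, stated here as an assumption rather than a confirmed fact: Instagram keeps only the posts near the viewport in the DOM and removes earlier ones as you scroll, so a single lookup at the bottom returns just the last batch. A minimal sketch under that assumption collects the links after every scroll step instead of once at the end. The names `collect_post_links`, `pause`, and `max_rounds` are hypothetical; only `execute_script` is taken from the code above.

```python
import time


def collect_post_links(driver, pause=2.0, max_rounds=200):
    """Scroll down step by step and harvest post links after every step.

    Hypothetical helper: `driver` is any object with an `execute_script`
    method, such as a Selenium WebDriver.
    """
    posts = []          # links in first-seen order
    seen = set()        # fast duplicate check
    last_height = -1
    for _ in range(max_rounds):
        # Read the hrefs inside the browser, so no element handle can go
        # stale if the page removes the anchor later.
        hrefs = driver.execute_script(
            "return Array.from(document.querySelectorAll('a'), a => a.href);"
        )
        for href in hrefs or []:
            if href and "/p/" in href and href not in seen:
                seen.add(href)
                posts.append(href)
        # Scroll to the bottom and read the current page height.
        height = driver.execute_script(
            "window.scrollTo(0, document.body.scrollHeight);"
            "return document.body.scrollHeight;"
        )
        if height == last_height:
            break       # height stopped growing: nothing new was loaded
        last_height = height
        time.sleep(pause)   # give the next batch of posts time to load
    return posts
```

With a live session it would be called as `posts = collect_post_links(driver)`. Collecting on every round means links that have already been dropped from the page are still kept in `posts`, and `max_rounds` bounds the loop on very long profiles.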


