Sunday, 22 October 2017

Web scraping the next page

I'm having trouble scraping data from several pages of a website. I've searched Google for solutions, but none of them worked.

My goal is to scrape just the names of the graphics cards from this website: "http://ift.tt/2gBAbZC"

First, I wrote code that handles just one of the pages, and that works pretty well.


    # urllib2 exists only in Python 2; on Python 3 urlopen lives in urllib.request
    from urllib.request import urlopen as uReq
    from bs4 import BeautifulSoup as soup

    my_url = "http://ift.tt/2gBAbZC"
    uClient = uReq(my_url)
    page_html = uClient.read()
    uClient.close()
    page_soup = soup(page_html, "html.parser")

    # Each product sits in a <div class="item-container">
    containers = page_soup.findAll("div", {"class": "item-container"})

    for container in containers:
        # The product name is the text of the <a class="item-title"> link
        title_container = container.findAll("a", {"class": "item-title"})
        product_name = title_container[0].text
        print("product_name: " + product_name)


With this, I got the names of the graphics cards on page 2. If I change the page number in the URL to 1, I get the names from the first page as well.


I tried to write a loop to handle all the pages, but it seems to return just the first page over and over again.


    i = 1
    my_url = "http://ift.tt/2yKAckE".format(i)
    while i <= 3: 
        uClient = uReq(my_url)
        page_html = uClient.read()
        uClient.close()
        page_soup = soup(page_html, "html.parser")

        # this is what I'm going to use for the loop

        containers = page_soup.findAll("div",{"class":"item-container"})

        container = containers[0]


        for container in containers:
            title_container = container.findAll("a",{"class":"item-title"})
            product_name = title_container[0].text

            print("product_name: " + product_name)

        i = i+1


Can anyone help me with that? =D

PS: Feel free to change my code and propose a better solution, folks. PS 2: I'm using Python 3.5 in Jupyter Lab.
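
One likely cause of the repeated page: in the loop above, `.format(i)` runs once, before the `while` starts, so `my_url` never changes as `i` grows. In addition, `str.format` returns the string unchanged when it contains no `{}` placeholder, which is the case for a shortened link. Below is a minimal sketch of one way to restructure it, assuming the full listing URL has a spot for the page number. The template `https://example.com/graphics-cards/Page-{}` and the helper names `page_urls`, `product_names`, and `scrape` are hypothetical, invented for illustration; substitute the site's real, unshortened URL.

```python
from urllib.request import urlopen


def page_urls(url_template, first_page, last_page):
    """Return one URL per page by filling the page number into the template."""
    # format() runs once per page here, so each URL gets its own number.
    return [url_template.format(page) for page in range(first_page, last_page + 1)]


def product_names(page_html):
    """Extract the product names from the HTML of one listing page."""
    # Imported here so page_urls() can be tried without bs4 installed.
    from bs4 import BeautifulSoup

    page_soup = BeautifulSoup(page_html, "html.parser")
    names = []
    for container in page_soup.find_all("div", {"class": "item-container"}):
        title = container.find("a", {"class": "item-title"})
        if title is not None:  # skip containers that have no title link
            names.append(title.text)
    return names


def scrape(url_template, first_page, last_page):
    """Fetch every page in turn and print its product names."""
    for url in page_urls(url_template, first_page, last_page):
        with urlopen(url) as client:  # the connection is closed on exit
            page_html = client.read()
        for name in product_names(page_html):
            print("product_name: " + name)


# Hypothetical template: the "{}" marks where the page number goes.
EXAMPLE_TEMPLATE = "https://example.com/graphics-cards/Page-{}"
print(page_urls(EXAMPLE_TEMPLATE, 1, 3))
```

Calling `scrape(real_template, 1, 3)` with the site's actual URL template would then fetch pages 1 through 3 in turn. Keeping the URL building in its own function makes it easy to check that the page number really changes before any request is sent.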



