Tuesday, 20 March 2018

How can I retrieve the titles of various articles from one website using Python web scraping?

I am looking to get the topics of all the articles available on this webpage " " using a Python web crawler. I am very new to HTML. This is the code I have so far, which I put together from different examples. Would somebody please help me understand it and get the correct code?

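from urllib.request import urlopen
from urllib.error import HTTPError, URLError

from bs4 import BeautifulSoup

try:
    # Fetch the search page
    html = urlopen("https://query.nytimes.com/search/sitesearch/")
except HTTPError as e:
    # The server answered with an error status (e.g. 404 or 500)
    print(e)
except URLError:
    # The server could not be reached at all
    print("Server down or incorrect domain")
else:
    # Parse the response and collect every <h2 class="widget-title"> element
    res = BeautifulSoup(html.read(), "html.parser")
    tags = res.find_all("h2", {"class": "widget-title"})

    # Print the text of each matching heading
    for tag in tags:
        print(tag.get_text())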
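
To make the selection step in the code above easier to follow, here is a minimal sketch of what matching `h2` elements by class amounts to. It deliberately uses the standard library's `html.parser.HTMLParser` instead of BeautifulSoup, and it runs on a small inline HTML string rather than the live page. The sample markup, the `TitleCollector` class, and the `extract_titles` helper are all assumptions made for illustration; the real page may use different tags and class names, so the heading tag and class would need to be checked in the browser's page source.

```python
from html.parser import HTMLParser

# Hypothetical sample markup; the real page's tags and class names may differ.
SAMPLE_HTML = """
<html><body>
  <h2 class="widget-title">First headline</h2>
  <h2 class="other">Not a title</h2>
  <div><h2 class="widget-title big">Second <em>headline</em></h2></div>
</body></html>
"""


class TitleCollector(HTMLParser):
    """Collect the text of every `tag` element whose class list contains `css_class`."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.titles = []
        self._buffer = None  # holds text pieces while inside a matching element

    def handle_starttag(self, tag, attrs):
        # An element may carry several space-separated classes.
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self._buffer = []

    def handle_data(self, data):
        # Keep text only while inside a matching element.
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # Closing the matching element finishes one title.
        if tag == self.tag and self._buffer is not None:
            self.titles.append("".join(self._buffer).strip())
            self._buffer = None


def extract_titles(html, tag="h2", css_class="widget-title"):
    """Return the text of all matching elements, in document order."""
    parser = TitleCollector(tag, css_class)
    parser.feed(html)
    parser.close()
    return parser.titles


titles = extract_titles(SAMPLE_HTML)
for title in titles:
    print(title)
```

If the loop prints nothing for a real page, the usual causes are that the headings use a different tag or class than the one searched for, or that the page fills in its results with JavaScript after loading, in which case the downloaded HTML does not contain them.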



