I want to get the topics (titles) of all the articles available on this webpage " " using a Python web crawler. I am very new to HTML. This is the code I have so far, which I pieced together from different examples. Could somebody please help me understand it and get it working correctly?
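# Python 2 (urllib2); in Python 3 these names live in urllib.request and urllib.error
from urllib2 import urlopen
from urllib2 import HTTPError
from urllib2 import URLError
from bs4 import BeautifulSoup

try:
    html = urlopen("https://query.nytimes.com/search/sitesearch/")
except HTTPError as e:
    # The server responded with an error status (e.g. 404 or 500)
    print(e)
except URLError:
    # The server could not be reached at all
    print("Server down or incorrect domain")
else:
    # Parse the downloaded page and collect every <h2 class="widget-title">
    res = BeautifulSoup(html.read(), 'html.parser')
    tags = res.findAll("h2", {"class": "widget-title"})
    for tag in tags:
        print(tag.getText())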
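For comparison, here is a minimal Python 3 sketch of the same approach, with the parsing split out so it can be tried on a plain HTML string without touching the network. The function names `extract_titles` and `fetch_titles` are hypothetical, chosen only for illustration, and the `h2` / `widget-title` selector is simply carried over from the code above; whether the target page actually contains such elements in its static HTML is not verified here.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

from bs4 import BeautifulSoup


def extract_titles(html, tag_name="h2", css_class="widget-title"):
    """Return the stripped text of every matching tag in the given HTML.

    tag_name and css_class are assumptions copied from the question's code.
    """
    soup = BeautifulSoup(html, "html.parser")
    # class_ matches any tag whose class list contains css_class
    tags = soup.find_all(tag_name, class_=css_class)
    return [tag.get_text(strip=True) for tag in tags]


def fetch_titles(url):
    """Download url and return its titles, or an empty list on failure."""
    try:
        with urlopen(url) as response:
            html = response.read()
    except HTTPError as e:
        # HTTPError is a subclass of URLError, so it must be caught first
        print("HTTP error:", e.code)
    except URLError as e:
        print("Server down or incorrect domain:", e.reason)
    else:
        return extract_titles(html)
    return []
```

Calling `fetch_titles("https://query.nytimes.com/search/sitesearch/")` and printing each returned string would mirror the original loop. If the list comes back empty even though the request succeeded, the selector probably does not match the page's markup, or the content may be filled in by JavaScript after the page loads, which `urlopen` does not execute.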