I am trying to pull only the links and their text from a webpage, line by line, and insert each link's text and URL into a dictionary, without using Beautiful Soup or a regex.
I keep getting this error:
Traceback (most recent call last):
  File "F:/Homework7-2.py", line 13, in <module>
    link2 = link1.split("href=")[1]
IndexError: list index out of range
Code:
import urllib.request
url = "http://www.facebook.com"
page = urllib.request.urlopen(url)
mylinks = {}
links = page.readline().decode('utf-8')
for items in links:
    links = page.readline().decode('utf-8')
    if "a href=" in links:
        links = page.readline().decode('utf-8')
        link1 = links.split(">")[0]
        link2 = link1.split("href=")[1]
        mylinks = link2
        print(mylinks)
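A note on the likely cause, with a possible workaround. The `if` test checks one line for "a href=", but the next statement reads a new line before splitting, so the line that actually gets split may not contain "href=" at all. In that case split("href=") returns a one-element list and [1] raises the IndexError. Separately, `for items in links` iterates over the characters of the first line, not over the lines of the page. Below is a minimal sketch, not a definitive solution, that uses only string methods (no Beautiful Soup, no regex). The function name `extract_links` and the sample HTML are hypothetical, and the sketch assumes lowercase `<a ...>` tags whose href value contains no ">".

```python
def extract_links(html):
    """Collect {link text: URL} pairs from <a> tags using only str methods."""
    links = {}
    pos = 0
    while True:
        start = html.find("<a ", pos)          # next opening anchor tag
        if start == -1:
            break
        tag_end = html.find(">", start)        # end of the opening tag
        close = html.find("</a>", tag_end)     # matching closing tag
        if tag_end == -1 or close == -1:
            break                              # malformed or truncated markup
        tag = html[start:tag_end]
        pos = close + len("</a>")              # continue after this anchor
        if "href=" not in tag:
            continue                           # e.g. <a name="top">, no link
        value = tag.split("href=", 1)[1].strip()
        if value[:1] in ('"', "'"):
            # Quoted value: take everything up to the matching quote.
            url = value[1:].split(value[0], 1)[0]
        else:
            # Unquoted value: take everything up to the first whitespace.
            url = value.split(None, 1)[0] if value else ""
        text = html[tag_end + 1:close].strip()
        links[text] = url
    return links


# Hypothetical sample input, used here instead of a live page.
sample = (
    '<p><a href="http://example.com/a">First</a> and '
    "<a class=\"x\" href='/b'>Second</a> "
    '<a name="top">Anchor</a></p>'
)
mylinks = extract_links(sample)
print(mylinks)
```

To apply this to a real page, the whole response can be read once with page.read().decode('utf-8') and passed to the function, instead of calling readline() repeatedly. Working on the full text also handles anchors that span more than one line. Note that links sharing the same text overwrite each other, since the text is the dictionary key.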