I am using Python to retrieve HTML from a webpage and then parse it in the MyHtmlParser class. If I find certain data in the HTML, I want to return it to the main method. I have printed the results while still inside the MyHtmlParser class, so I know it is finding what I want, but I do not know how to return the data to my main method.
import urllib2
from MyHtmlParser import MyHtmlParser

def HtmlRetrieve(url):
    req = urllib2.Request(url, headers={'User-Agent': "Magic Browser"})
    con = urllib2.urlopen(req)
    return con.read()

def main():
    url = "someUrl.com"
    html = HtmlRetrieve(url)
    parser = MyHtmlParser()
    parser.feed(html)
    print parser.links

main()
This is my MyHtmlParser class:
from HTMLParser import HTMLParser

class MyHtmlParser(HTMLParser):
    def __init__(self):
        # The base-class initializer must be passed self explicitly
        HTMLParser.__init__(self)
        self.links = []

    def handle_data(self, data):
        if data == "some text":
            self.links.append(data)
Why is the data not being returned to my main method?
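For reference, here is a minimal, self-contained sketch of the pattern I am after: the parser collects matches in an instance attribute, and the caller reads that attribute after feed() returns. The names LinkTextParser, find_matches, and sample are hypothetical and only for illustration. The import fallback and the strip() call are assumptions on my part (the first so the sketch runs on Python 2 or 3, the second in case the text arrives with surrounding whitespace), not a confirmed diagnosis.

```python
try:
    from html.parser import HTMLParser  # Python 3 module name
except ImportError:
    from HTMLParser import HTMLParser   # Python 2 module name


class LinkTextParser(HTMLParser):
    """Hypothetical parser that stores matching text on the instance."""

    def __init__(self):
        # Call the base initializer with self so the parser is set up properly
        HTMLParser.__init__(self)
        self.links = []

    def handle_data(self, data):
        # Assumption: ignore surrounding whitespace before comparing
        text = data.strip()
        if text == "some text":
            self.links.append(text)


def find_matches(html):
    """Feed the HTML to a fresh parser and return whatever it collected."""
    parser = LinkTextParser()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return parser.links


sample = "<p><a href='#'> some text </a></p><p>other</p>"
print(find_matches(sample))
```

The handler methods such as handle_data are callbacks, so their return values are discarded by the parser; storing results on self and reading them after feed() is the usual way to get data back to the caller.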