Thursday, October 22, 2015

Web Parsing Python - Trying to get faculty names between the 'strong' tags

from bs4 import BeautifulSoup  # import the BeautifulSoup package
import urllib2

url2 = 'http://ift.tt/1OK85nM'
page2 = urllib2.urlopen(url2)
soup2 = BeautifulSoup(page2.read(), "lxml")

row2 = soup2.findAll('p')
row2 = row2[18:-4] 

names2 = []
arrayNameLength = len(row2)
for x in names2:
    current2 = row2[x]
    currentString2 = current2.findAll('strong')
    if len(currentString2) > 0:
        currentString2 = currentString2[0]
        names2.append(currentString2.text)

Hey guys, here's my code. I'm trying to scrape the faculty names from the site above, but I'm having trouble grabbing the names inside the strong tags for every entry in the list. Any idea what I'm doing wrong? Many thanks!
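
Judging from the code alone, one likely culprit is the loop header: `for x in names2` iterates over `names2`, which is still an empty list at that point, so the loop body never runs and nothing is ever appended. Looping over the paragraphs in `row2` directly should avoid that. Below is a minimal sketch of the idea, not a tested fix for the real page: `SAMPLE_HTML` and the names in it are made up for illustration, the helper `extract_strong_names` is hypothetical, and `"html.parser"` is swapped in for `"lxml"` only so the sketch has no extra dependency. The `[18:-4]` slice is left out because it depends on the actual page layout.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the faculty page; the real markup may differ.
SAMPLE_HTML = """
<p>Intro paragraph without a name.</p>
<p><strong>Jane Doe</strong>, Professor</p>
<p><strong>John Smith</strong>, Lecturer</p>
<p>Footer paragraph.</p>
"""


def extract_strong_names(html):
    """Return the text of the first <strong> tag in each <p> that has one."""
    soup = BeautifulSoup(html, "html.parser")
    names = []
    # Loop over the paragraphs themselves, not over the (still empty) result list.
    for paragraph in soup.find_all("p"):
        strong = paragraph.find("strong")  # None when the paragraph has no <strong>
        if strong is not None:
            names.append(strong.get_text(strip=True))
    return names


names2 = extract_strong_names(SAMPLE_HTML)
print(names2)
```

In the original script the same change would amount to writing `for current2 in row2:` and dropping the `row2[x]` lookup, assuming the slice really does cover the paragraphs that hold the names.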



