I am new to Python and I am trying to get some song names from my favorite radio station's website, but whatever I do, I cannot get into div ui-view="main.header" class="ng-scope"
to read the song names.
With my code, the output written to the txt file contains only the first level of divs and nothing deeper:
<div id="audio-player" style="width: 0px; height: 0px"></div>
<div id="fb-root"></div>
<div ui-view="main.header"></div>
<div ui-view="main.content"></div>
<div ui-view="main.footer"></div>
The song list refreshes every 10 s. Is that area blocked for scraping because of that? I have also tried div1 = soup.findAll('div'),
with no success.
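One likely explanation for the empty placeholders: attributes such as ng-class, ng-controller, ng-repeat and ui-view are AngularJS markup, so the header, content and footer are probably filled in by JavaScript in the browser after the page loads, and a plain HTTP fetch only sees the empty divs. The sketch below is a minimal check of that assumption, using Python 3's standard html.parser on the static HTML quoted above. PlaceholderFinder and find_placeholders are hypothetical names introduced for illustration.

```python
from html.parser import HTMLParser

# The static HTML as returned by the server (copied from the listing above).
RAW_HTML = """
<div id="audio-player" style="width: 0px; height: 0px"></div>
<div id="fb-root"></div>
<div ui-view="main.header"></div>
<div ui-view="main.content"></div>
<div ui-view="main.footer"></div>
"""


class PlaceholderFinder(HTMLParser):
    """Record each ui-view div and whether anything is inside it."""

    def __init__(self):
        super().__init__()
        self.current = None      # ui-view name of the div being read, if any
        self.depth = 0           # div nesting depth inside that placeholder
        self.has_content = {}    # ui-view name -> True if it has content

    def handle_starttag(self, tag, attrs):
        if self.current is not None:
            # Any tag opened inside a ui-view div counts as content.
            self.has_content[self.current] = True
            if tag == "div":
                self.depth += 1
            return
        view = dict(attrs).get("ui-view")
        if tag == "div" and view is not None:
            self.current = view
            self.depth = 1
            self.has_content[view] = False

    def handle_data(self, data):
        # Non-whitespace text inside a placeholder also counts as content.
        if self.current is not None and data.strip():
            self.has_content[self.current] = True

    def handle_endtag(self, tag):
        if self.current is not None and tag == "div":
            self.depth -= 1
            if self.depth == 0:
                self.current = None


def find_placeholders(html):
    """Return a dict mapping each ui-view name to whether it has content."""
    finder = PlaceholderFinder()
    finder.feed(html)
    return finder.has_content


if __name__ == "__main__":
    for name, filled in find_placeholders(RAW_HTML).items():
        print(name, "has content" if filled else "is empty")
```

If every placeholder comes back empty, nesting more findAll calls on this HTML cannot reach the song list; the data would have to come from the rendered page or from whatever request the page's scripts make.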
You can see the full website source at www.rockfm.ro
This is the HTML I want to parse:
<head></head>
<body ng-class="bodyClass">
<script src="https://www.youtube.com/iframe_api" data-remove="false"></script>
<script src="http://ift.tt/om8mte" data-remove="false"></script>
<script src="http://ift.tt/2hugToZ" data-remove="false"></script>
<script data-remove="false">
<script data-remove="false">
<div id="audio-player" style="width: 0px; height: 0px">
<div id="fb-root" class=" fb_reset">
<!-- uiView: main.header -->
<div ui-view="main.header" class="ng-scope">
<div id="topnav" ng-controller="HeaderCtrl" class="ng-scope">
<div class="container top-stripe">
</div>
<div class="container menu-expand" ng-class="{'show-expand':isMenuOpen}">
<div class="col-md-3">
<div class="col-md-6">
<div class="col-md-3 menu-expand-latest-tracks">
<div class="latest-tracks ng-isolate-scope" track-list="trackList.lista">
<h4>Ultimele 10 piese</h4>
<ul>
<!-- ngRepeat: track in trackList.lista -->
<li ng-repeat="track in trackList.lista" class="ng-binding ng-scope">Steve Stevens - Top Gun Anthem</li>
<!-- end ngRepeat: track in trackList.lista -->
<li ng-repeat="track in trackList.lista" class="ng-binding ng-scope">Boston - More Than A Feeling</li>
<!-- end ngRepeat: track in trackList.lista -->
<li ng-repeat="track in trackList.lista" class="ng-binding ng-scope">Rammstein - Mein Hertz Brennt</li>
<!-- end ngRepeat: track in trackList.lista -->
<li ng-repeat="track in trackList.lista" class="ng-binding ng-scope">Inxs - Never Tear Us Apart</li>
<!-- end ngRepeat: track in trackList.lista -->
<li ng-repeat="track in trackList.lista" class="ng-binding ng-scope">Nirvana - Smells Like Teen Spirit</li>
<!-- end ngRepeat: track in trackList.lista -->
<li ng-repeat="track in trackList.lista" class="ng-binding ng-scope">Rockfm - Stiri</li>
<!-- end ngRepeat: track in trackList.lista -->
<li ng-repeat="track in trackList.lista" class="ng-binding ng-scope">Phoenix - Nunta</li>
<!-- end ngRepeat: track in trackList.lista -->
<li ng-repeat="track in trackList.lista" class="ng-binding ng-scope">Survivor - Burning Heart</li>
<!-- end ngRepeat: track in trackList.lista -->
<li ng-repeat="track in trackList.lista" class="ng-binding ng-scope">Holograf - Banii Vorbesc</li>
<!-- end ngRepeat: track in trackList.lista -->
<li ng-repeat="track in trackList.lista" class="ng-binding ng-scope">It Rocks</li>
<!-- end ngRepeat: track in trackList.lista -->
</ul>
</div>
</div>
</div>
</div>
</div>
</div>
</body>
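Once markup like the listing above is available as a string (for example, saved from the browser's inspector, which is an assumption here and not something the code below does), pulling out the song names is straightforward. This is a minimal sketch using Python 3's standard html.parser; TrackListParser and extract_tracks are hypothetical names, and the sample is shortened to two tracks from the listing.

```python
from html.parser import HTMLParser

# A shortened copy of the rendered track list shown above.
RENDERED_HTML = """
<div class="latest-tracks ng-isolate-scope" track-list="trackList.lista">
<h4>Ultimele 10 piese</h4>
<ul>
<!-- ngRepeat: track in trackList.lista -->
<li ng-repeat="track in trackList.lista" class="ng-binding ng-scope">Steve Stevens - Top Gun Anthem</li>
<!-- end ngRepeat: track in trackList.lista -->
<li ng-repeat="track in trackList.lista" class="ng-binding ng-scope">Boston - More Than A Feeling</li>
<!-- end ngRepeat: track in trackList.lista -->
</ul>
</div>
"""


class TrackListParser(HTMLParser):
    """Collect the text of every <li> repeated over trackList.lista."""

    def __init__(self):
        super().__init__()
        self.in_track = False   # True while inside a matching <li>
        self.tracks = []        # collected song names, in page order

    def handle_starttag(self, tag, attrs):
        if tag == "li" and dict(attrs).get("ng-repeat") == "track in trackList.lista":
            self.in_track = True
            self.tracks.append("")

    def handle_data(self, data):
        if self.in_track:
            self.tracks[-1] += data

    def handle_endtag(self, tag):
        if tag == "li" and self.in_track:
            self.in_track = False
            self.tracks[-1] = self.tracks[-1].strip()


def extract_tracks(html):
    """Return the list of song names found in the rendered HTML."""
    parser = TrackListParser()
    parser.feed(html)
    return parser.tracks


if __name__ == "__main__":
    for track in extract_tracks(RENDERED_HTML):
        print(track)
```

Matching on the ng-repeat attribute rather than on the class keeps the filter tied to the track list itself, since ng-binding and ng-scope are generic classes that may appear on other elements.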
This is my code:
# Python 2 with BeautifulSoup 3
import urllib
from BeautifulSoup import BeautifulSoup

url = 'http://www.rockfm.ro'
html = urllib.urlopen(url).read()
soup = BeautifulSoup(html)
# Note: findAll(True) returns every descendant tag, not only direct children
div1 = soup.findAll(True)
# Code to get into the nested divs, one loop per level
for div2 in div1:
    print("Level 1: " + str(div2))
    with open('rock.txt', 'a') as file:
        file.write("Level 1: " + str(div2) + "\n")
    div3 = div2.findAll(True)
    for div4 in div3:
        print("Level 2: " + str(div4))
        with open('rock.txt', 'a') as file:
            file.write("Level 2: " + str(div4) + "\n")
        div5 = div4.findAll(True)
        for div6 in div5:
            print("Level 3: " + str(div6))
            with open('rock.txt', 'a') as file:
                file.write("Level 3: " + str(div6) + "\n")
            div7 = div6.findAll(True)
            for div8 in div7:
                print("Level 4: " + str(div8))
                with open('rock.txt', 'a') as file:
                    file.write("Level 4: " + str(div8) + "\n")
                div9 = div8.findAll(True)
                for div10 in div9:
                    print("Level 5: " + str(div10))
                    with open('rock.txt', 'a') as file:
                        file.write("Level 5: " + str(div10) + "\n")
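The five hand-written loops can be replaced by a single pass that tracks nesting depth, so the number of levels does not have to be known in advance. The sketch below swaps BeautifulSoup for Python 3's standard html.parser and logs only the tag name per level, not the full element. DepthLogger and log_levels are hypothetical names, and the depth counter assumes reasonably well-balanced markup.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class DepthLogger(HTMLParser):
    """Record every opening tag together with its nesting level."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # number of currently open (non-void) elements
        self.lines = []   # one "Level N: <tag>" entry per opening tag

    def handle_starttag(self, tag, attrs):
        level = self.depth + 1
        self.lines.append("Level %d: <%s>" % (level, tag))
        if tag not in VOID_TAGS:
            self.depth = level

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: log it without going deeper.
        self.lines.append("Level %d: <%s>" % (self.depth + 1, tag))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1


def log_levels(html):
    """Return the list of "Level N: <tag>" lines for the given HTML."""
    logger = DepthLogger()
    logger.feed(html)
    return logger.lines


if __name__ == "__main__":
    sample = "<div><ul><li>a</li><li>b</li></ul><br></div><p>x</p>"
    print("\n".join(log_levels(sample)))
```

Collecting the lines in a list also means the output file can be opened once and written in a single call, instead of being reopened for every tag.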
Thank you very much in advance!