Thanks for looking into my issue. I am trying to get the next-page link from an old Reddit page, but the find method returns None. The code:
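import requests
from bs4 import BeautifulSoup

def crawl(self):  # method of my crawler class
    curr_page_url = self.start_url
    curr_page = requests.get(curr_page_url)
    bs = BeautifulSoup(curr_page.text, 'lxml')
    # all_links = GetAllLinks(self.start_url)
    # find() returns None here, so the ['href'] lookup fails
    nxtlink = bs.find('a', attrs={'rel': 'nofollow next'})['href']
    print(nxtlink)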
The page I am scraping is an old Reddit page. The next-page link I am trying to get sits inside a span tag, this one:
<span class="next-button">
<a href="https://old.reddit.com/r/learnprogramming/?count=25&after=t3_j54ae2" rel="nofollow next">next ›</a>
</span>
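One possible cause, offered as a guess because the post does not show the response Reddit actually returned: the default requests User-Agent often gets an error or rate-limit page from Reddit, and that page has no next button for find to match. The sketch below sends a custom User-Agent, checks the status code, and selects the link through its span instead of its rel value. The names HEADERS, find_next_link and fetch_next_link and the User-Agent string are hypothetical, and the parser defaults to html.parser only to keep the example dependency-light; 'lxml' can be passed instead.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical User-Agent string; use any descriptive value of your own.
HEADERS = {"User-Agent": "my-crawler/0.1 (learning project)"}


def find_next_link(html, parser="html.parser"):
    """Return the href of the next-page link, or None if it is absent."""
    soup = BeautifulSoup(html, parser)
    # select_one returns None when nothing matches, so check before indexing.
    link = soup.select_one("span.next-button a[href]")
    return link["href"] if link is not None else None


def fetch_next_link(url):
    """Download url and return its next-page link, or None."""
    response = requests.get(url, headers=HEADERS, timeout=10)
    # Raise on 4xx/5xx responses instead of silently parsing an error page.
    response.raise_for_status()
    return find_next_link(response.text)


# Offline check against the markup quoted in the question.
sample = """
<span class="next-button">
<a href="https://old.reddit.com/r/learnprogramming/?count=25&after=t3_j54ae2" rel="nofollow next">next ›</a>
</span>
"""
print(find_next_link(sample))
print(find_next_link("<p>no pagination here</p>"))
```

Inside crawl, the call would be something like nxtlink = fetch_next_link(self.start_url), followed by a check for None before following the link. Printing curr_page.status_code in the original code is a quick way to confirm or rule out this guess.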