I just started experimenting with Python and BeautifulSoup.
I want to get the links to articles that are related to a specific city.
Here is the current code:
import requests
from bs4 import BeautifulSoup

city = "london"
result = requests.get('https://www.sample.com/search/index.html?q=' + city)

def main_loop():
    soup = BeautifulSoup(result.content, features="lxml")
    articles = soup.find("div", "oc-articleList")
    print(articles)

if result.status_code == 200:
    main_loop()
else:
    print('error:', result.status_code)
The result is:
<div class="oc-articleList"></div>
The first thing I tried was getting the articles with:
articles = soup.find_all("article")
But it couldn't find anything.
If you check the site's source code, it looks something like this:
<div class="oc-articleList">
    <article>...</article>
    <article>...</article>
    <article>...</article>
    <article>...</article>
    ...
</div>
How can I make BS parse deeper into the DOM?
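As a sanity check, here is a minimal sketch that parses the structure above from a static string. The markup and the href values are made up for illustration, and html.parser is swapped in for lxml only to avoid the extra dependency. If nested tags are found here, then the parser depth is probably not the problem, and the empty div in the real response may mean the articles are added by JavaScript after the page loads, which requests does not run. That last part is an assumption I have not verified for this site.

```python
from bs4 import BeautifulSoup

# Hypothetical static HTML mirroring the structure seen in the browser.
html = """
<div class="oc-articleList">
  <article><a href="/news/1">One</a></article>
  <article><a href="/news/2">Two</a></article>
  <article><a href="/news/3">Three</a></article>
</div>
"""

soup = BeautifulSoup(html, features="html.parser")

# A string as the second argument is matched against the CSS class.
container = soup.find("div", "oc-articleList")

# find_all searches all descendants of the container, not just direct children.
articles = container.find_all("article")

# Collect the href of every link inside each article.
links = [a["href"] for article in articles for a in article.find_all("a", href=True)]

print(len(articles))
print(links)
```

Comparing result.text against what the browser's "view source" shows would be one way to check whether the article tags are present in the raw response at all.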