Friday, June 4, 2021

How do I extract large amounts of data from a web page and the pages it links to using Python?

I have been trying to scrape data from web pages for a data analytics project, and I have managed to get the data from a single page.

import requests
from bs4 import BeautifulSoup
import concurrent.futures
from urllib.parse import urlencode
from scraper_api import ScraperAPIClient

# Fetch the course search results page through ScraperAPI
client = ScraperAPIClient('key')
results = client.get(url = "https://www.essex.ac.uk/course-search?query=&f.Level%7CcourseLevel=Undergraduate").text

# Print the raw HTML of the results page
print(results)

I need to extract data from the first page and from the links on that page as well.

For example, on the site "https://ift.tt/3vSHfCj" I need to navigate into each course and extract the duration or the HTML code of that page.

For now I have only managed to extract the HTML code of the single page, and I'm not sure how to navigate into a course and get the HTML code of that page too. Any insights would help.
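
One possible way to approach this is sketched below: parse the listing page's HTML, collect the links on it, resolve them to absolute URLs, and fetch each one in turn. This is only a rough sketch under some assumptions. The names `LinkCollector`, `extract_links`, `scrape_with_links`, and the `must_contain` filter are made up for illustration, and the sketch uses only the standard library's `html.parser` instead of BeautifulSoup (the same links could be collected with `soup.find_all("a", href=True)`). The `fetch` argument is any function that takes a URL and returns HTML text, so the ScraperAPI client from the question can be plugged in.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag seen while parsing (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_links(html, base_url, must_contain=""):
    """Return absolute, de-duplicated same-site links from html, in page order."""
    parser = LinkCollector()
    parser.feed(html)
    base_host = urlparse(base_url).netloc
    seen, links = set(), []
    for href in parser.hrefs:
        # Resolve relative links against the page URL and drop any #fragment.
        absolute = urldefrag(urljoin(base_url, href)).url
        # Skip other sites (and mailto:, javascript:, etc., which have no host).
        if urlparse(absolute).netloc != base_host:
            continue
        # Assumption: detail pages share a recognisable URL substring.
        if must_contain not in absolute:
            continue
        if absolute not in seen:
            seen.add(absolute)
            links.append(absolute)
    return links


def scrape_with_links(start_url, fetch, must_contain=""):
    """Fetch the listing page, then each matching linked page; return {url: html}."""
    listing_html = fetch(start_url)
    pages = {start_url: listing_html}
    for link in extract_links(listing_html, start_url, must_contain):
        if link not in pages:
            pages[link] = fetch(link)
    return pages
```

With the client from the question, a call might look like `pages = scrape_with_links(url, lambda u: client.get(url=u).text, must_contain="/courses/")`, where `"/courses/"` is only a guess at how the course URLs are structured and would need checking against the real links. Each value in `pages` is the HTML of one page, which can then be parsed for the duration. If the listing page builds its results with JavaScript, the links may not be present in the fetched HTML at all, in which case this approach would not find them.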



