Here is my code to scrape a single page, but I have 11,000 of them. The only difference between them is the id in the URL:
https://www.rlsnet.ru/mkb_index_id_1.htm
https://www.rlsnet.ru/mkb_index_id_2.htm
https://www.rlsnet.ru/mkb_index_id_3.htm
....
https://www.rlsnet.ru/mkb_index_id_11000.htm
How can I loop my code to scrape all 11,000 pages? Is it even possible with such a large number of pages? I could put the URLs into a list and then scrape them, but typing out 11,000 URLs by hand would take forever.
import requests
import pandas as pd
from bs4 import BeautifulSoup

# Fetch and parse one page
page_sc = requests.get('https://www.rlsnet.ru/mkb_index_id_1.htm')
soup_sc = BeautifulSoup(page_sc.content, 'html.parser')

# Extract the name from each subcategory entry
items_sc = soup_sc.find_all(class_='subcatlist__item')
mkb_names_sc = [item_sc.find(class_='subcatlist__link').get_text() for item_sc in items_sc]

# Save the names to a CSV file
mkb_stuff_sce = pd.DataFrame({'first': mkb_names_sc})
mkb_stuff_sce.to_csv('/Users/gfidarov/Desktop/Python/MKB/mkb.csv')
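You don't need to list the URLs by hand: since only the id changes, you can build each URL with a format string and loop over `range(1, 11001)`. A minimal sketch of that idea, assuming the page structure is the same for every id (the helper names `make_url`, `scrape_page`, and `scrape_all` are my own, and the timeout and delay values are arbitrary):

```python
import time

import requests
import pandas as pd
from bs4 import BeautifulSoup

BASE_URL = 'https://www.rlsnet.ru/mkb_index_id_{}.htm'


def make_url(page_id):
    """Build the URL for a given page id (1 .. 11000)."""
    return BASE_URL.format(page_id)


def scrape_page(page_id):
    """Return the list of names found on one page (empty list on failure)."""
    resp = requests.get(make_url(page_id), timeout=10)
    if resp.status_code != 200:
        return []  # skip ids that do not resolve to a page
    soup = BeautifulSoup(resp.content, 'html.parser')
    items = soup.find_all(class_='subcatlist__item')
    return [item.find(class_='subcatlist__link').get_text() for item in items]


def scrape_all(first_id, last_id, out_path='mkb.csv'):
    """Scrape every page in [first_id, last_id] and write one combined CSV."""
    all_names = []
    for page_id in range(first_id, last_id + 1):
        all_names.extend(scrape_page(page_id))
        time.sleep(0.5)  # small pause between requests so we don't hammer the server

    pd.DataFrame({'first': all_names}).to_csv(out_path, index=False)


# scrape_all(1, 11000)  # uncomment to run the full crawl
```

So yes, 11,000 pages is feasible, but at roughly half a second per request a sequential crawl will take a couple of hours. If that is too slow you can parallelize with `concurrent.futures.ThreadPoolExecutor`, keeping the worker count modest, and you should check the site's robots.txt and terms before running a crawl of this size.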