Saturday, 2 May 2020

Using BeautifulSoup to scrape web data - having issues pulling what I need

I am trying to scrape the contents of a table from a website in Python using BeautifulSoup, but the table has several pages, and I can't work out how to scrape anything past page 1 of the table. My current script works fine for extracting page 1, but it's page 2 onwards that I am struggling to get.

The URL for page 2 of the table is the same as for page 1, so I can't iterate through page URLs.

Thanks in advance

```python
import json

import pandas as pd
import requests
from bs4 import BeautifulSoup

# Fetch the players page (the response only contains page 1 of the table).
result = requests.get('https://www.footballindex.co.uk/players')
src = result.content

soup = BeautifulSoup(src, 'lxml')

# The first <script> tag holds the page state in the form "<name> = {json}".
players1 = soup.find("script").text
players2 = players1.split('= ', 1)[1]
players3 = json.loads(players2)

# One row per player, keeping only the fields of interest.
df = pd.DataFrame(
    [
        [item['id'], item['country'], item['nationalTeam'],
         item['sector'], item['nationality'], item['team'],
         item['buyPrice'], item['sellPrice'], item['penceChange'],
         item['changePercent']]
        for item in players3['playersReducer']['players']
    ]
)
```
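The parsing step in the script above (split on `=`, then `json.loads`) can be made a little more defensive. The sketch below uses only the standard library and is not tested against the real site: the `COLUMNS` list simply repeats the keys from my script, the helper names are made up for illustration, and the `sample` string (including the `window.__STATE__` name) is invented data that only mimics the shape I expect.

```python
import json

# Keys copied from the script above; adjust if the page state differs.
COLUMNS = ["id", "country", "nationalTeam", "sector", "nationality",
           "team", "buyPrice", "sellPrice", "penceChange", "changePercent"]


def parse_embedded_state(script_text):
    """Parse the JSON object assigned in a script body like `name = {...};`."""
    # Everything after the first '=' should begin with the JSON object.
    _, _, tail = script_text.partition("=")
    # raw_decode parses one JSON value and ignores whatever follows it,
    # so a trailing ';' or further statements do not break the parse.
    state, _ = json.JSONDecoder().raw_decode(tail.lstrip())
    return state


def players_to_rows(state, columns=COLUMNS):
    """Build one row per player; missing keys become None instead of raising KeyError."""
    players = state["playersReducer"]["players"]
    return [[player.get(col) for col in columns] for player in players]


# Invented sample input (hypothetical variable name and values).
sample = ('window.__STATE__ = {"playersReducer": {"players": '
          '[{"id": "p1", "team": "A", "buyPrice": 1.5}]}};')
rows = players_to_rows(parse_embedded_state(sample))
print(rows)
```

The resulting `rows` list can be passed to pandas as `pd.DataFrame(rows, columns=COLUMNS)`, which also gives the columns readable names. This still only covers whatever players are embedded in the first page's script tag.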


