Thursday, April 22, 2021

(Python) Can't scrape data from my targeted site anymore using re, requests, and json

I'm having a problem where I can no longer scrape data from a website by reading the JSON stored in one of its JavaScript variables. I'm trying to scrape from Rocket League Tracker. Here's my code:

import requests
import re
import json

def rankGetter():

    trackerLink = 'https://rocketleague.tracker.network/rocket-league/profile/epic/DirectPanda/overview'

    # now we have the tracker link we're going to scrape the website
    # all the HTML of the site is now in result
    result = requests.get(trackerLink)

    # checker to make sure the user used the correct information
    if result.status_code == 400:
        print('profile not found')

    else:
        # Extract everything needed to render the current page. Data is stored as Json in the
        # JavaScript variable: window.__INITIAL_STATE__={"route":{"path":"\u0 ... }};
        json_string = re.search(r"window\.__INITIAL_STATE__\s?=\s?(\{.*?\});", result.text).group(1)

        # convert text string to structured json data
        rocketleague = json.loads(json_string)

        # Save structured json data to a text file that helps you orient yourself and pick
        # the parts you are interested in.
        with open('rocketleague_json_data.txt', 'w') as outfile:
            outfile.write(json.dumps(rocketleague, indent=4, sort_keys=True))
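
The extraction step above relies on a non-greedy regex that stops at the first "};" it finds, and it raises an AttributeError if the variable is missing from the page. A minimal sketch of a more defensive version follows, assuming the page still assigns a JSON object to window.__INITIAL_STATE__. The helper name extract_initial_state and the sample HTML are hypothetical and only there for illustration; the sketch uses json.JSONDecoder.raw_decode from the standard library so the parser, not the regex, decides where the object ends.

```python
import json
import re


# Hypothetical helper (not part of the original code): pull the JSON object
# assigned to window.__INITIAL_STATE__ out of a page's HTML.
def extract_initial_state(html):
    marker = re.search(r"window\.__INITIAL_STATE__\s*=\s*", html)
    if marker is None:
        return None  # the variable is not present in this page
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it (such as the trailing ";").
    state, _end = json.JSONDecoder().raw_decode(html, marker.end())
    return state


# Made-up sample page: the "};" inside the string value would cut a
# non-greedy regex match short, but does not confuse the JSON parser.
sample_html = (
    '<script>window.__INITIAL_STATE__ = '
    '{"stats-v2": {"segments": {}}, "note": "a};b"};</script>'
)
print(extract_initial_state(sample_html))
```

In rankGetter this would be called as extract_initial_state(result.text), with a check for None before going any further.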


The problem is that the generated text file no longer contains the ranks I want:

"stats": {
    "standardLeaderboardLeaders": {},
    "standardLeaderboards": [],
    "standardPlayers": {},
    "standardTitles": {}
},
"stats-v2": {
    "segments": {},
    "standardProfileMatches": {},
    "standardProfileSummaries": {},
    "standardProfiles": {},
    "standardProfilesHistory": {},
    "standardSessions": {},
    "subscriptions": {}
},
"titles": {
    "currentTitle": {
        "name": "Rocket League",
        "platforms": [

The ranks should be under stats-v2, but as you can see it's empty now. What's happening, and how do I fix it? I was able to get ranks for a week, but all of a sudden it stopped working today.
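
To confirm that the section really is empty, rather than just hard to spot in the dump, a small check like the one below might help. It is only a sketch: the helper name empty_keys is hypothetical, and the state dictionary is a hand-written stand-in shaped like the excerpt above, not real data from the site.

```python
# Hypothetical helper: list the keys of one section whose values are empty
# (an empty dict, an empty list, None, and so on).
def empty_keys(section):
    return sorted(key for key, value in section.items() if not value)


# Hand-written stand-in for the parsed page state, shaped like the dump above.
state = {
    "stats-v2": {
        "segments": {},
        "standardProfiles": {},
        "subscriptions": {},
    },
    "titles": {"currentTitle": {"name": "Rocket League"}},
}

for name in ("stats-v2", "titles"):
    section = state.get(name, {})
    print(name, "empty keys:", empty_keys(section))
```

In the real script, state would be the rocketleague dictionary returned by json.loads.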



