Webscraping with Python Requests and getting Access Denied even after updating headers

Sunday, 28 March 2021

This web scraper worked for a while, but the website must have been updated, because it no longer works. Every request now returns an Access Denied error. I have tried adding headers but still get the same result. This is what the code prints:

</html>

<html><head>
<title>Access Denied</title>
</head><body>
<h1>Access Denied</h1>

You don't have permission to access "http://www.jdsports.co.uk/product/white-nike-air-force-1-shadow-womens/15984107/" on this server.<p>
Reference #18.4d4c1002.1616968601.6e2013c
</p></body>
</html>
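
Because the block page is ordinary HTML, it can be recognised before any parsing for stock buttons happens. The following is a minimal sketch, assuming the block page always carries the "Access Denied" title and a "Reference #" line exactly as in the output above. The names `is_access_denied` and `reference_id` are hypothetical helpers, not part of the original scraper.

```python
import re

# Hypothetical helpers, not part of the original scraper.
# Assumption: the block page always looks like the output printed above
# (an "Access Denied" title plus a "Reference #..." line).
DENIED_TITLE = re.compile(r"<title>\s*Access Denied\s*</title>", re.IGNORECASE)
REFERENCE_LINE = re.compile(r"Reference\s*#\s*([0-9a-fA-F.]+)")


def is_access_denied(html):
    """Return True if the HTML text looks like the block page."""
    return DENIED_TITLE.search(html) is not None


def reference_id(html):
    """Return the reference string from the block page, or None if absent."""
    match = REFERENCE_LINE.search(html)
    return match.group(1) if match else None


# The printed output from the question, used here as sample input.
sample = """<html><head>
<title>Access Denied</title>
</head><body>
<h1>Access Denied</h1>
Reference #18.4d4c1002.1616968601.6e2013c
</body>
</html>"""

if is_access_denied(sample):
    print("Blocked, reference:", reference_id(sample))
```

Checking for the block page first keeps an empty result from the button search from being mistaken for "out of stock". The check works on the text as printed above, so in the scraper it would be applied to `str(soup)`.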

Here's the part of the code that gets the HTML:

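import requests
from bs4 import BeautifulSoup

# info[0] (the URL to fetch) and proxy_test (the proxies mapping)
# are defined elsewhere in the script.
scraper = requests.Session()

headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36',
}

html = scraper.get(info[0], proxies=proxy_test, headers=headers).text
soup = BeautifulSoup(html, 'html.parser')

print(soup)
# Collect the buttons with the classes "btn" and "btn-default".
stock = soup.find_all("button", {"class": "btn btn-default"})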

What else can I try to fix it? The website I want to scrape is https://www.jdsports.co.uk/
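
One thing that may be worth trying is sending a fuller, browser-like header set instead of only `User-Agent`. The sketch below assumes the server inspects more than the `User-Agent` header, which the post does not confirm. If the block is based on something else, such as the proxy's IP address, this will not help. The function `browser_like_headers` is a hypothetical helper, and the header values are illustrative rather than taken from the site.

```python
# Hypothetical helper. The header values are illustrative and are not
# taken from the original post or from any documentation of the site.
def browser_like_headers(user_agent, referer=None):
    """Build a header set closer to what a desktop browser sends."""
    headers = {
        "User-Agent": user_agent,
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "en-GB,en;q=0.9",
        "Accept-Encoding": "gzip, deflate",
        "Upgrade-Insecure-Requests": "1",
    }
    # Only send a Referer when the caller supplies one.
    if referer is not None:
        headers["Referer"] = referer
    return headers


headers = browser_like_headers(
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36",
    referer="https://www.jdsports.co.uk/",
)
print(headers)
```

With the session from the question, the result could be applied once with `scraper.headers.update(headers)` so that every later request carries the same headers.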
