Thursday, May 9, 2019

Intercepting AJAX calls while web-scraping

I am trying to scrape an app store, specifically the reviews for every app. The reviews are loaded dynamically: the client side makes an AJAX POST call to fetch them, and the payload for that call is also generated dynamically for every page. Is there a way to intercept this request in code and capture the payload, so that I can replay the call with requests or another library? I can already make the call from Postman using the parameters shown in the Network tab of the browser's developer tools.

I could use Selenium to scrape the fully loaded page, but it waits for the entire page to load, which is wasteful since I don't need the rest of the page.

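import requests

# Form-encoded body; generated dynamically per page, but constant for a given page
payload = "<This is dynamically created for every page and is constant for that given page>"
headers = {"Content-Type": "application/x-www-form-urlencoded"}
url = "https://appexchange.salesforce.com/appxListingDetail"

# Replay the AJAX POST with the captured payload
r = requests.post(url, data=payload, headers=headers)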

I was wondering whether it is possible to obtain this payload from within the scraper itself, by intercepting the AJAX calls made when it loads the base web page.
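
For illustration, here is a minimal sketch of one way such interception might work, assuming Chrome driven by Selenium 4 with performance logging enabled. The function names extract_post_payloads and capture_payload are hypothetical, and the sketch assumes Chrome includes the request body in the logged event, which it may not do for large bodies. Setting the page load strategy to "none" is meant to avoid waiting for the whole page; whether the review request fires early enough on this particular site is untested.

```python
import json
import time


def extract_post_payloads(log_entries, url_fragment):
    """Return bodies of POST requests whose URL contains url_fragment.

    log_entries is a list of dicts shaped like the entries returned by
    Selenium's driver.get_log("performance") for Chrome: each has a
    "message" key holding a JSON string that wraps a DevTools event.
    """
    payloads = []
    for entry in log_entries:
        event = json.loads(entry["message"])["message"]
        if event.get("method") != "Network.requestWillBeSent":
            continue
        request = event["params"]["request"]
        if request.get("method") == "POST" and url_fragment in request.get("url", ""):
            # Chrome may omit postData for large bodies; those are skipped here.
            if "postData" in request:
                payloads.append(request["postData"])
    return payloads


def capture_payload(page_url, url_fragment, timeout=30.0):
    """Open page_url in Chrome and return the first matching POST body, or None."""
    from selenium import webdriver  # imported lazily; assumes Selenium 4+

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    # Return from driver.get() immediately instead of waiting for full load.
    options.page_load_strategy = "none"
    # Ask Chrome to record DevTools network events in the performance log.
    options.set_capability("goog:loggingPrefs", {"performance": "ALL"})

    driver = webdriver.Chrome(options=options)
    try:
        driver.get(page_url)
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            # Each get_log call returns only the entries added since the last call.
            found = extract_post_payloads(driver.get_log("performance"), url_fragment)
            if found:
                return found[0]
            time.sleep(0.5)
        return None
    finally:
        driver.quit()
```

Under these assumptions, calling capture_payload with a listing page URL and the fragment "appxListingDetail" would return the form-encoded string to pass as data= to requests.post, or None if no matching request was seen before the timeout.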



