Tuesday, July 23, 2019

Web scraping a table from an interactive website using rvest

I'm trying to scrape the table on this interactive web page: https://games.crossfit.com/leaderboard/open/2019?country_champions=0&division=1&citizenship=US&citizenship_display=United+States&sort=0&scaled=0&page=1

Below is my original code:

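library(rvest)  # provides read_html(), html_nodes(), and the %>% pipe

url <- 'https://games.crossfit.com/leaderboard/open/2019?country_champions=0&division=1&citizenship=US&citizenship_display=United+States&sort=0&scaled=0&page=1'
# Drill down from the leaderboard container to the table inside it
US_male <- read_html(url) %>%
  html_nodes('#leaderboard') %>%
  html_nodes('div.lb-main.container') %>%
  html_nodes('div table')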

After running this, US_male is {xml_nodeset (0)}. But if I shorten it to

US_male <- read_html(url) %>%
  html_nodes('#leaderboard') %>%
  html_nodes('div.lb-main.container')

it returns

{xml_nodeset (1)}
[1] <div class="lb-main container"></div>

If you inspect the web page, there is a <tbody> tag under <table class="desktop athletes">. I don't understand why the content of the table is not showing up. How should I scrape the table correctly?

An answer in either R or Python is fine; I can learn whichever works. I'd appreciate any help!
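
One possible explanation, not verified against the site itself: read_html() only sees the HTML the server sends, while the browser inspector shows the page after its JavaScript has run and filled in the table. The sketch below, in Python with only the standard library, illustrates that difference on two hand-written snippets. STATIC_HTML is modelled on the empty container shown above (the wrapping div with id "leaderboard" is an assumption), and RENDERED_HTML is a made-up example of what the inspector might show; neither was fetched from the real page, and TableCounter and count_tables are hypothetical helper names.

```python
from html.parser import HTMLParser

# Modelled on the output above: the container exists but is empty.
# The outer <div id="leaderboard"> wrapper is an assumption for illustration.
STATIC_HTML = '<div id="leaderboard"><div class="lb-main container"></div></div>'

# Hypothetical example of the same container after the page's scripts have run.
RENDERED_HTML = (
    '<div id="leaderboard"><div class="lb-main container">'
    '<table class="desktop athletes"><tbody><tr><td>1</td></tr></tbody></table>'
    '</div></div>'
)


class TableCounter(HTMLParser):
    """Counts <table> start tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.tables = 0

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables += 1


def count_tables(html):
    """Return the number of <table> elements found in the given HTML string."""
    parser = TableCounter()
    parser.feed(html)
    parser.close()
    return parser.tables


print(count_tables(STATIC_HTML))    # 0
print(count_tables(RENDERED_HTML))  # 1
```

If that is what is happening here, the usual options are to drive a real browser (for example with Selenium) so the scripts run before parsing, or to look in the browser's network tab for the request that delivers the table data and read that response directly.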



