Unable to scrape threads using rvest in R

Wednesday, 19 December 2018

I need to scrape the threads and replies from the following website:

I tried this code:

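library(rvest)

# `url` holds the base URL of the forum listing; the page number is appended to it.
N_pages <- 5
A <- NULL
D <- NULL

for (j in 1:N_pages) {
  review <- read_html(paste0(url, j))
  # Thread titles and author labels, each as a one-column matrix
  threads <- cbind(review %>% html_nodes(".threadtitle") %>% html_text())
  author <- cbind(review %>% html_nodes(".label") %>% html_text())
  # Stack titles and authors on top of each other in a single column
  X <- rbind(A, threads, author)
  x <- as.data.frame(X)
}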

Problem: I used SelectorGadget to find the CSS selectors for the elements I need. However, when I run the code, I do not get the required results.

Output I get:

1 Title/thread Starter

2 Sticky: ****Please use the search****

3 Sticky: **** The Official Atlas SUV DIY/FAQ thread****

Required output:

Threads | Author     | Replies
Text    | name, date | text
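A minimal sketch of one way to get a table of that shape, assuming each thread sits in its own container element. Only `.threadtitle` and `.label` come from the code above; the `.threadbit` and `.replies` class names, the sample markup, and the `scrape_page()` helper are hypothetical and would need to be replaced with whatever the real page uses. The idea is to select one node per thread first and then look up the title, author and reply count inside each node, so the columns stay aligned, and to collect the per-page results in a list before binding them together.

```r
library(rvest)

# Hypothetical markup standing in for two pages of the forum listing.
# `.threadbit` and `.replies` are assumed names, not taken from the real site.
page_html <- c(
  '<ul><li class="threadbit">
     <a class="threadtitle">First thread</a>
     <span class="label">Started by alice, 12-01-2018</span>
     <span class="replies">4</span>
   </li></ul>',
  '<ul><li class="threadbit">
     <a class="threadtitle">Second thread</a>
     <span class="label">Started by bob, 12-02-2018</span>
     <span class="replies">0</span>
   </li></ul>'
)

# Hypothetical helper: turn one page into a data frame with one row per thread.
scrape_page <- function(html) {
  # One node per thread
  rows <- read_html(html) %>% html_nodes(".threadbit")
  # html_node() (singular) returns one result per input node,
  # so the three columns always have the same length.
  data.frame(
    Threads = rows %>% html_node(".threadtitle") %>% html_text(trim = TRUE),
    Author  = rows %>% html_node(".label") %>% html_text(trim = TRUE),
    Replies = rows %>% html_node(".replies") %>% html_text(trim = TRUE),
    stringsAsFactors = FALSE
  )
}

# Scrape every page, then bind the per-page data frames once at the end.
result <- do.call(rbind, lapply(page_html, scrape_page))
print(result)
```

For the real site, the inline strings would be replaced by `paste0(url, j)` for each page number `j`, which `read_html()` also accepts.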

How do I scrape these threads? Should I use rvest, or does this have to be done through an API/JSON? How do I know which approach to use?
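One rough way to tell the two cases apart is to check whether the selectors match anything in the HTML that `read_html()` actually receives. The sketch below uses an inline page as a stand-in for the downloaded one, and `count_matches()` is a hypothetical helper, not part of rvest.

```r
library(rvest)

# Hypothetical helper: count how many nodes each CSS selector matches
# in the HTML that was actually returned by the server.
count_matches <- function(page, selectors) {
  vapply(selectors, function(css) length(html_nodes(page, css)), integer(1))
}

# Inline stand-in for a downloaded page: it contains a thread title
# but no element with class "label".
page <- read_html('<div><a class="threadtitle">A thread</a></div>')

counts <- count_matches(page, c(".threadtitle", ".label"))
print(counts)
```

If a selector that works in the browser matches zero nodes here, the element is probably added by JavaScript after the page loads, or the selector is wrong. In the first case it may be worth looking in the browser's network tab for a JSON request that carries the data. If the counts are non-zero, rvest alone is likely enough.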
