Friday, 28 February 2020

Collecting subpages using R

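library(GetoptLong)  # qq()
library(Rcrawler)    # LinkExtractor()
library(stringr)     # str_detect()

# Collect the internal links from topic-list pages 2 to 144.
pagelinklists <- NULL
for (i in 2:144) {
  page <- qq("https://forum.federalsoup.com/default.aspx?g=topics&f=4&p=@{i}")
  pagenumber <- LinkExtractor(page)
  pagelinklists <- c(pagelinklists, pagenumber$InternalLinks)
}
# Undo the HTML-escaped "&amp;" and keep only thread links.
pagelinklists <- gsub("amp;", "", pagelinklists)
pagelinklists <- pagelinklists[str_detect(pagelinklists, "g=posts&t=\\d+$")]
# URLlist is not defined in this snippet; it is assumed to exist already.
pagelinklists <- unique(c(URLlist, pagelinklists))

# Collect the paginated subpages of each thread.
subpages <- NULL  # must exist before the first c(subpages, ...)
for (i in pagelinklists) {
  pages <- LinkExtractor(i)
  subpages <- c(subpages, pages$InternalLinks)
}
subpages <- gsub("amp;", "", subpages)
subpages <- subpages[str_detect(subpages, "g=posts&t=\\d+&p=\\d+$")]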

I entered the lines above, which should produce a variable holding the list of URLs I'm looking for. However, I get the following error:

Error: : '' does not exist in current working directory

I'm not sure what to do now.
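One way to narrow this down: the message resembles what xml2 reports when asked to read a string that is neither markup, a URL, nor an existing file, so an empty URL or an empty page body is a plausible cause, though the post does not confirm it. The sketch below, under that assumption, drops blank entries before looping and records failures instead of stopping. `fetch_links`, `clean_urls`, and `collect_links` are hypothetical names, and `fetch_links` is only a stand-in for `LinkExtractor` so the example runs without network access.

```r
# Hypothetical stand-in for Rcrawler::LinkExtractor(); it fails on an empty
# URL to imitate the kind of error described in the post.
fetch_links <- function(url) {
  if (!nzchar(url)) stop("'' does not exist in current working directory")
  list(InternalLinks = paste0(url, "&p=", 1:2))
}

# Drop NA and empty strings before looping, so no blank URL is ever fetched.
clean_urls <- function(urls) {
  urls <- urls[!is.na(urls) & nzchar(urls)]
  unique(urls)
}

# Fetch every URL, recording failures instead of aborting the whole loop.
collect_links <- function(urls, fetch = fetch_links) {
  links <- character(0)
  failed <- character(0)
  for (u in urls) {
    res <- tryCatch(fetch(u), error = function(e) NULL)
    if (is.null(res)) {
      failed <- c(failed, u)   # remember which URL caused the error
    } else {
      links <- c(links, res$InternalLinks)
    }
  }
  list(links = unique(links), failed = failed)
}

# Made-up candidate list containing a blank and a missing entry.
candidates <- c("https://example.com/default.aspx?g=posts&t=1", "", NA,
                "https://example.com/default.aspx?g=posts&t=2")
result <- collect_links(clean_urls(candidates))
```

With the real crawler, passing `fetch = LinkExtractor` and then inspecting `result$failed` would show which URLs trigger the error, which may be easier to debug than a loop that stops at the first failure.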



