Monday, 26 October 2020

R: readLines on a URL leads to missing lines

When I call readLines() on a URL, the result seems to be missing lines or values. This might be due to whitespace or encoding that R can't read properly.

When you open the URL above in a browser, Ctrl+F finds 38 instances of text that matches "TV-". On the other hand, when I run readLines() and then grep("TV-", HTML), I only get 12 results.

So, how can I avoid encoding or spacing errors so that I get the complete lines of the HTML?
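
One thing worth checking before blaming encoding: grep() returns the indices of the elements that match, so a line containing several "TV-" strings is counted only once, while a browser search counts every occurrence. Whether that accounts for the 38 versus 12 gap on this particular page is an assumption, not something confirmed here. The sketch below uses a small made-up character vector (the `html` contents are hypothetical, standing in for the output of readLines()) to show the difference between counting matching lines and counting matches.

```r
# Hypothetical stand-in for the result of readLines(url): three "lines",
# one of which contains several matches on its own.
html <- c(
  "<span>TV-14</span><span>TV-MA</span><span>TV-PG</span>",
  "<p>no rating here</p>",
  "<span>TV-G</span>"
)

# grep() returns the indices of matching elements, so each line counts once.
matching_lines <- grep("TV-", html, fixed = TRUE)
length(matching_lines)   # 2 lines contain "TV-"

# gregexpr() locates every match inside each element;
# regmatches() extracts them as a list with one entry per line.
matches <- regmatches(html, gregexpr("TV-", html, fixed = TRUE))
total_matches <- sum(lengths(matches))
total_matches            # 4 occurrences in total
```

If the two counts differ on the real page in the same way, the lines are probably complete and only the counting method needs to change.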



