import java.io.IOException;
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Fetches the page at matchLink and stores its body text and body HTML in fields.
public void read(String matchLink) {
    try {
        // Prepare a connection to the web page
        Connection conn = Jsoup.connect(matchLink);
        // Execute the GET request and parse the response
        Document doc = conn.get();
        // Store the text and the inner HTML of the <body> element
        this.webPageText = doc.body().text();
        this.htmlCode = doc.body().html();
    } catch (IOException ex) {
        System.out.println(ex + " in read method");
    }
}
I've been using the Document.body().text() method to get the raw text of a web page, but it does not seem to work on some pages. Can you recommend a different approach? I've been searching on Google, but I couldn't find a reliable method. Thank you.