Friday, October 4, 2019

Get specific information from Wikipedia Information Box

I'm trying to get the latest-release details from the information box on the right side of the Wikipedia page for Windows Server 2012. Specifically, I want to retrieve "6.2 (Build 9200) / August 1, 2012; 7 years ago" from the box by scraping the page with jsoup.

I have code that pulls all the data from the box, but I can't figure out how to pull out just that specific part.

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class InfoboxDump {
    public static void main(String[] args) throws java.io.IOException {
        // Fetch and parse the whole page (parseBodyFragment is meant for HTML fragments).
        Document doc = Jsoup.connect("https://en.wikipedia.org/wiki/Windows_Server_2012").get();
        // Pick the first table whose class list includes "infobox".
        Element infobox = doc.selectFirst("table.infobox");
        if (infobox != null) {
            System.out.println(infobox.outerHtml());
        }
    }
}
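One possible way to narrow this down to a single field is sketched below. It assumes each infobox row holds its label in a th cell and its value in a td cell, with the label reading "Latest release"; that matches how the page appeared to be laid out, but Wikipedia's markup can change, so treat it as a starting point rather than a guaranteed fix. The class name InfoboxField and the method find are hypothetical names made up for this example.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper, not part of jsoup.
final class InfoboxField {

    // Returns the text of the first infobox row whose header cell matches
    // the given label, or null when no such row exists.
    // Assumption: label in <th>, value in <td>, both in the same <tr>.
    static String find(Document doc, String label) {
        for (Element row : doc.select("table.infobox tr")) {
            Element header = row.selectFirst("th");
            Element value = row.selectFirst("td");
            if (header != null && value != null
                    && header.text().equalsIgnoreCase(label)) {
                // text() drops the tags and collapses runs of whitespace.
                return value.text();
            }
        }
        return null;
    }

    public static void main(String[] args) throws java.io.IOException {
        Document doc = Jsoup.connect("https://en.wikipedia.org/wiki/Windows_Server_2012").get();
        System.out.println(find(doc, "Latest release"));
    }
}
```

Comparing the header text in a loop, rather than relying on a single selector, keeps the match working when the label is wrapped in a link. The returned string may still contain footnote markers such as "[1]", which would need to be stripped separately.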


