Friday, March 2, 2018

Java: Extracting data from a website table

I'm trying to scrape data from the NYSE website. There's a table (though it's not formatted as an HTML table, but rather as divs nested inside divs inside divs) with data points I want to analyze. I have the following method to load the HTML so I can begin parsing it:

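import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public static void skim() throws IOException {

    URL url = new URL("https://www.nyse.com/quote/XNYS:JNJ");
    StringBuilder buffer = new StringBuilder();

    // Decode the response as text (assuming UTF-8) and close the stream when done
    try (BufferedReader reader = new BufferedReader(
            new InputStreamReader(url.openConnection().getInputStream(), StandardCharsets.UTF_8))) {
        int ptr;
        while ((ptr = reader.read()) != -1) {
            System.out.print((char) ptr);   // echo the raw HTML
            buffer.append((char) ptr);      // keep a copy for parsing
        }
    }

}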

It works, but the HTML it prints differs from the HTML I see when I use Inspect Element in the browser: the actual data is missing. My guess is that the data is loaded separately from the page's skeleton HTML, probably by JavaScript after the initial page load. How do I get the data points from the website? Is there a particular way I should be loading the page?
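To check the guess that the data is not part of the downloaded HTML, a small diagnostic like the one below might help. It is only a sketch: the class name StaticHtmlCheck, both method names, the sample HTML string, and the value "129.53" are all made up for illustration and do not come from the NYSE page. The idea is to take a value that is visible in the browser and test whether it occurs in the raw HTML string, and to count the script tags that could be filling it in later.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StaticHtmlCheck {

    // Counts opening <script> tags in the raw HTML (case-insensitive).
    public static int countScriptTags(String html) {
        Matcher m = Pattern.compile("<script\\b", Pattern.CASE_INSENSITIVE).matcher(html);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // True if a value seen in the browser also appears in the raw HTML.
    public static boolean isInStaticHtml(String html, String expectedValue) {
        return html.contains(expectedValue);
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the text collected in 'buffer' by skim()
        String html = "<html><body><div id=\"quote\"></div>"
                + "<script src=\"app.js\"></script></body></html>";

        // "129.53" is a made-up price used only for this example
        System.out.println(isInStaticHtml(html, "129.53")); // prints false
        System.out.println(countScriptTags(html));          // prints 1
    }
}
```

If the value is absent from the raw HTML while script tags are present, that is consistent with the page filling in the table after load, which a plain URL connection would never see.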



