Sunday, March 6, 2016

Accessing the HTML of a page after logging in

Recently, I decided to try writing a Java program that retrieves the source code of a webpage after logging into the site. I've searched SO threads at length and found many helpful answers, but I still don't understand why my code only returns the non-logged-in Document for the page. I assume the cookies from the login are not being used correctly, but I'm not sure. All help is appreciated!

import java.io.IOException;
import java.net.MalformedURLException;
import java.util.Map;

import org.jsoup.Connection;
import org.jsoup.Connection.Method;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class PageLogin {

private static Map<String, String> loginCookies;

public PageLogin() {
    login();
}

private static void login() {
    try {
        Connection.Response res = Jsoup.connect("http://ift.tt/1UJE3SF")
                .data("e-mail",       "myEmail")
                .data("password",       "myPass")
                .method(Method.POST)
                .execute();


        loginCookies = res.cookies();
    } catch (MalformedURLException ex) {
        System.out.println("The URL specified was unable to be parsed or uses an invalid protocol. Please try again.");
        System.exit(1);
    } catch (Exception ex) {
        System.out.println(ex.getMessage() + "\nAn exception occurred.");
        System.exit(1);
    }
}

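public Document getDoc(String url) {
    try {
        // Send the cookies captured during login with the page request.
        return Jsoup.connect(url)
                    .cookies(loginCookies)
                    .get();

    } catch (IOException e) {
        // Report the failure instead of silently swallowing it.
        System.out.println("Could not fetch " + url + ": " + e.getMessage());
    }

    return null;
}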



public static void main(String[] args) throws IOException {
    PageLogin test1 = new PageLogin();
    Document doc = test1.getDoc("http://ift.tt/1X3TB2g");
    System.out.println(doc);
}
}
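
A login that appears to succeed but still yields the logged-out page often comes down to session state: some sites set a session cookie on the login page itself, or expect hidden form fields, before they accept the POST. Whether that applies to this site is an assumption, since the login form is not shown. As a minimal sketch of one way to rule out the cookie handling, the example below swaps Jsoup's manual cookie map for the JDK's `java.net.http.HttpClient` (Java 11+) with a `CookieManager`, so every cookie set along the way is stored and resent automatically. The class name `SessionFetch` and its methods are hypothetical; the field names `e-mail` and `password` are taken from the code above and may not match the site's actual form.

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Jsoup: keeps one cookie jar for the whole session.
public class SessionFetch {

    // Encodes form fields as an application/x-www-form-urlencoded request body.
    static String encodeForm(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            body.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    // Logs in, then fetches pageUrl with the same client so its cookie jar is reused.
    static String fetchAfterLogin(String loginUrl, String pageUrl, Map<String, String> fields)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .cookieHandler(new CookieManager(null, CookiePolicy.ACCEPT_ALL))
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();

        // Step 1: load the login page so any pre-login session cookie is stored.
        client.send(HttpRequest.newBuilder(URI.create(loginUrl)).GET().build(),
                HttpResponse.BodyHandlers.discarding());

        // Step 2: submit the credentials; cookies set by the response go into the jar.
        HttpRequest post = HttpRequest.newBuilder(URI.create(loginUrl))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(encodeForm(fields)))
                .build();
        HttpResponse<String> login = client.send(post, HttpResponse.BodyHandlers.ofString());
        System.out.println("Login status: " + login.statusCode());

        // Step 3: request the protected page; the client attaches the stored cookies.
        HttpRequest get = HttpRequest.newBuilder(URI.create(pageUrl)).GET().build();
        return client.send(get, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: java SessionFetch <loginUrl> <pageUrl>");
            return;
        }
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("e-mail", "myEmail");   // assumed field name, copied from the question
        fields.put("password", "myPass");  // assumed field name, copied from the question
        System.out.println(fetchAfterLogin(args[0], args[1], fields));
    }
}
```

The returned string can be handed to `Jsoup.parse(html, pageUrl)` to get a `Document` as before. If the page still comes back logged out, the login form probably needs something this sketch does not send, such as a hidden token field or different field names, which can be checked by inspecting the form in the browser.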



