Friday, 22 November 2019

Getting only symbols when downloading a website with Java

I usually use this to get the HTML of a website:

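import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;

public class Download {

    public static void main(String[] args) {

        String website = "https://stackoverflow.com/";

        try {
            URL url = new URL(website);

            // try-with-resources closes the reader even if reading fails
            try (BufferedReader br = new BufferedReader(new InputStreamReader(url.openStream()))) {
                String line;

                // print the response body line by line
                while ((line = br.readLine()) != null) {
                    System.out.println(line);
                }
            }

        } catch (MalformedURLException e) {
            System.out.println("Malformed URL: " + e.getMessage());
        } catch (IOException e) {
            System.out.println("I/O Error: " + e.getMessage());
        }
    }
}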

This works great. Recently, though, I tried to download a page and got only cryptic symbols like this:

A.]Ì¢5÷‰)†º¬ˆ®è&ûõdÀ´u5w䓳¡Zn¸4§÷Žtuë¡Pñ_MϦÎ@èÉGfp³~HïøHL×”6µ4SzEƒ¯æÌ¹É+®éÄÉ“ ¶Ð]1×ãôüã°Ñ<Þk6åº|B¯½o;úúà®Êñ¢Q?…Ôó¨ÆrÍ*^)Q@‘uⳫ7¯É—`ázªë  ›K~eôÞŒ•*7tøöK,ë3W'6ÍþVõ•›rb¿Óè¶÷òÂ.+èV&Úw£ødáÂü€jS¬’í’èÑ^4 Ò Š^s:Щý²«»TÈ~BâÝwùŠ?çwv
OÍo¯ûƒr<¹eé5H€aÓL¦ç˜œ?}'4?GŠoÔ›€ž¸Mþ?fÞ³²úˆX¿QàÅ7$ÊíO`ˇ‡°!dGH»ÒÅyõC.«ïì2cÜ$&®4íþp.1™`

But when I save the page using Chrome, everything looks fine. Is this some kind of protection on the site's side? Or do I need to change some format or encoding setting?

I used the link:

https://www.amazon.de/Stackoverflow-T-Shirt-Overflowing-Stack-Overflow/dp/B07KYZJGYR/ref=sr_1_1?__mk_de_DE=%C3%85M%C3%85%C5%BD%C3%95%C3%91&dchild=1&keywords=stackoverflow&qid=1574411393&sr=8-1
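
One thing I could check, though I have not confirmed it for this page, is whether the server sends the body gzip-compressed. A browser would decompress that transparently, while `url.openStream()` hands back the raw bytes. Below is a minimal sketch under that assumption: it looks at the `Content-Encoding` response header and wraps the stream in a `GZIPInputStream` when the header says `gzip`. The class name `GzipAwareDownload` and the helpers `unwrap` and `readAll` are made up for illustration, and UTF-8 is assumed as the page charset.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

public class GzipAwareDownload {

    // Hypothetical helper: decompress only when the server reports gzip encoding.
    static InputStream unwrap(InputStream raw, String contentEncoding) throws IOException {
        if ("gzip".equalsIgnoreCase(contentEncoding)) {
            return new GZIPInputStream(raw);
        }
        return raw;
    }

    // Hypothetical helper: read the whole stream as text (UTF-8 is an assumption).
    static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String website = args.length > 0 ? args[0] : "https://stackoverflow.com/";
        HttpURLConnection conn = (HttpURLConnection) new URL(website).openConnection();

        // getContentEncoding() returns the Content-Encoding header, or null if absent
        try (InputStream body = unwrap(conn.getInputStream(), conn.getContentEncoding())) {
            System.out.print(readAll(body));
        }
    }
}
```

If the header turns out to be something other than `gzip` (or is missing), this sketch falls back to reading the stream unchanged, so the symbols would then have a different cause.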



