web: urllib2.urlopen(url).read() fails to read the URL content

Wednesday, 23 December 2015

I am trying to read the web content of the link http://ift.tt/1U6jZqN using the following Python code:

import requests  # not used below
import urllib2   # Python 2 module

# Browser-like User-Agent header sent with the request
hdr = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11'}
url = 'http://ift.tt/1U6jZqN'  # the URL must be a quoted string
req = urllib2.Request(url, headers=hdr)
page = urllib2.urlopen(req).read()

Running print page gives the following output:

<!DOCTYPE html>
<head>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<meta http-equiv="cache-control" content="max-age=0" />
<meta http-equiv="cache-control" content="no-cache" />
<meta http-equiv="expires" content="0" />
<meta http-equiv="expires" content="Tue, 01 Jan 1980 1:00:00 GMT" />
<meta http-equiv="pragma" content="no-cache" />
<meta http-equiv="refresh" content="10; url=/distil_r_captcha.html?Ref=/Mobile-Phones/y149&amp;distil_RID=97C53AFC-AA02-11E5-B76A-8C12C4D2AB6C&amp;distil_TID=20151224055301" />
<script type="text/javascript">
    (function(window){
        try {
            if (typeof sessionStorage !== 'undefined'){
                sessionStorage.setItem('distil_referrer', document.referrer);
            }
        } catch (e){}
    })(window);
</script>
<script type="text/javascript" src="/QkrDIV1cexsvzwdadarecara.js" defer></script><style type="text/css">#d__fFH{position:absolute;top:-5000px;left:-5000px}#d__fF{font-family:serif;font-size:200px;visibility:hidden}#qttwcrxueetv{display:none!important}</style></head>
<body>
<div id="distil_ident_block">&nbsp;</div>
</body>
</html>

Is there any workaround to read the actual URL content? Any help is appreciated. Thanks in advance!
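For what it is worth, the output above does not look like the target page. It contains a meta refresh pointing at /distil_r_captcha.html and a div with id distil_ident_block, which suggests the request was answered by a bot-detection interstitial (the distil_ names hint at Distil Networks, though that is an inference from the markup). A minimal sketch for detecting this case before trying to parse the response follows. It assumes Python 3 (under Python 2 the import would be `from HTMLParser import HTMLParser`), and the names `MetaRefreshFinder` and `looks_like_distil_block` are hypothetical helpers, not part of any library.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Collect the target URL of every <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.refresh_targets = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attrs = dict(attrs)
        if (attrs.get("http-equiv") or "").lower() != "refresh":
            return
        # The content attribute looks like "10; url=/some/path?x=1".
        content = attrs.get("content") or ""
        _, _, rest = content.partition(";")
        key, _, target = rest.strip().partition("=")
        if key.strip().lower() == "url" and target:
            self.refresh_targets.append(target.strip())


def looks_like_distil_block(html):
    """Return True if the HTML carries the markers seen in the output above.

    The two markers (a refresh to distil_r_captcha and the
    distil_ident_block div) are taken from that output; other
    block pages may look different.
    """
    finder = MetaRefreshFinder()
    finder.feed(html)
    captcha_redirect = any(
        "distil_r_captcha" in target for target in finder.refresh_targets
    )
    return captcha_redirect or 'id="distil_ident_block"' in html
```

Calling `looks_like_distil_block(page)` on the string returned by the request would report whether the interstitial was served instead of the real content (in Python 3 the bytes from `read()` need decoding first, for example `page.decode("utf-8", "replace")`). Since the block page is returned even with a browser-like User-Agent, changing headers alone is unlikely to help; checking whether the site offers an API or permits automated access is probably the more reliable route.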
