Sunday, 30 October 2016

Darknet crawler in Python

I'm trying to write a simple web crawler for the "Darknet" web. My first step is to reach a darknet site from a Python script. I tried many answers, but none of them worked.

What I did: I installed Tor in Docker as root.
I managed to reach this site with a regular browser after the right configuration.
I managed to fetch check.torproject.org with my script.
I am running Ubuntu 16.04 in a VM.

My code now is:

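import socks
import ssl
import requests.certs

# Route a raw socket through the local Tor SOCKS5 proxy.
s = socks.socksocket()
s.setproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", port=9050)
# NOTE: connect() expects a bare hostname here, not a full URL.
s.connect(('http://ift.tt/2f1FwnC', 443))
# Pass the CA bundle path itself, not the literal string "requests.cert.where()".
ss = ssl.wrap_socket(s, cert_reqs=ssl.CERT_REQUIRED, ca_certs=requests.certs.where())

print "Peer cert: ", ss.getpeercert()

# The Host header should also be a bare hostname, and the request must end with a blank line.
ss.write("""GET / HTTP/1.0\r\nHost: http://ift.tt/2fj1aaC\r\n\r\n""")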

content = []
while True:
    data = ss.read()
    if not data:
        break
    content.append(data)


ss.close()
content = "".join(content)
assert "This browser is configured to use Tor" in content

I think my problem now is caused by using https instead of http (should I change the port?).

Is there a better solution? Any explanation of how to do it? Thanks.
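One possible direction, as a minimal sketch rather than a confirmed fix: the SOCKS port (9050 above) stays the same for http and https, because only the destination port changes (80 or 443), and a library such as requests picks that from the URL scheme. The sketch assumes requests 2.10 or later installed with SOCKS support (pip install requests[socks]) and Tor listening on 127.0.0.1:9050. The helper names tor_proxies and fetch_via_tor are made up for illustration. The socks5h scheme asks the proxy to resolve hostnames, so DNS lookups also go through Tor.

```python
# Sketch only: assumes `pip install requests[socks]` and a Tor SOCKS5
# proxy listening on 127.0.0.1:9050. Helper names are hypothetical.


def tor_proxies(host="127.0.0.1", port=9050):
    """Build a requests-style proxies mapping for a local Tor SOCKS5 port."""
    # socks5h (not socks5) makes the proxy resolve hostnames, so name
    # resolution happens inside Tor instead of on the local machine.
    proxy = "socks5h://{0}:{1}".format(host, port)
    return {"http": proxy, "https": proxy}


def fetch_via_tor(url, timeout=60):
    """Fetch `url` through the Tor proxy and return the body as text."""
    # Imported here so tor_proxies() can be used without requests installed.
    import requests

    response = requests.get(url, proxies=tor_proxies(), timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of returning an error page
    return response.text
```

With Tor running, fetch_via_tor("https://check.torproject.org/") should return a page containing "This browser is configured to use Tor", the same check as the assert in the code above.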

I checked all of these questions and none of them worked: "Python urllib over TOR?", "How to route urllib requests through the TOR network?", "Using SocksiPy with SSL".
(I saw in their comments that I am not the only one for whom they did not work.)



All I need is to get the pastes from this "discussion".



