Thursday, 21 January 2016

Web scraping with Python after login

I'm writing a web-scraping app. I only have the file name, and I need the "hash id" value that follows id32> in the hidden cell of the row below:

>hash id

Any suggestions?

<tr class="even">
    <td style="display:none">id&gt;277364359,id32&gt;hash id,level&gt;0,key_left&gt;0,key_right&gt;0,name&gt;main.py,type&gt;File py,size&gt;2.06 KB,hash&gt;8294d1f20939d431a6c9a07f6d0797a5</td>
    <td style="width:15px;"><input type="checkbox" class="select-checkbox" id="checkbox_277364359"/></td>
    <td><img src='/images/filemanager/file_small.png' alt=''> main.py</td><td>File py</td>
    <td class="td-for-select">2.06 KB</td><td style="width: 90px;">date</td>
    <td style="width: 80px;">-</td><td style="width: 50px;">0</td>
</tr>

Python code:

    import requests
    from lxml import html

    USERNAME = "xxxx"
    PASSWORD = "xxxx"
    LOGIN_URL = "http://ift.tt/UD1D8M"
    URL = "http://ift.tt/1NnEDO7"

    # Use one session so the login cookies are sent with later requests
    session_requests = requests.session()
    result = session_requests.get(LOGIN_URL)
    tree = html.fromstring(result.text)
    payload = {"LoginForm[email]": USERNAME, "LoginForm[password]": PASSWORD}
    result = session_requests.post(LOGIN_URL, data=payload, headers=dict(referer=LOGIN_URL))
    # Fetch the file list page after logging in
    result = session_requests.get(URL, headers=dict(referer=URL))
    tree = html.fromstring(result.content)
    # Select the hidden cells that hold the "key>value" metadata
    bucket_elems = tree.findall(".//td[@style='display:none']")
    bucket_names = [bucket_elem.text_content().replace("\n", "").strip() for bucket_elem in bucket_elems]
    print(bucket_names)

I have no idea how to get the hash id from there.
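
One possible approach, sketched under some assumptions: the hidden cell in the row above looks like a comma-separated list of key>value pairs, so it could be split into a dict and the row matched on its "name" field. The helper names here (parse_hidden_cell, find_hash_id, HiddenCellCollector) and SAMPLE_ROW are hypothetical, and the sketch uses the standard library's HTMLParser instead of lxml so that it runs without extra packages. It also assumes no value contains a comma.

```python
from html.parser import HTMLParser


def parse_hidden_cell(text):
    """Split 'key>value,key>value' text into a dict (assumes no commas in values)."""
    fields = {}
    for pair in text.split(","):
        key, sep, value = pair.partition(">")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


class HiddenCellCollector(HTMLParser):
    """Collect the text of every <td style="display:none"> cell."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # turns &gt; into >
        self.cells = []
        self._buffer = None

    def handle_starttag(self, tag, attrs):
        style = (dict(attrs).get("style") or "").replace(" ", "").rstrip(";")
        if tag == "td" and style == "display:none":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._buffer is not None:
            self.cells.append("".join(self._buffer))
            self._buffer = None


def find_hash_id(page_source, file_name):
    """Return the id32 value of the row whose name matches file_name, else None."""
    collector = HiddenCellCollector()
    collector.feed(page_source)
    collector.close()
    for cell_text in collector.cells:
        fields = parse_hidden_cell(cell_text)
        if fields.get("name") == file_name:
            return fields.get("id32")
    return None


# Sample markup shortened from the row in the question
SAMPLE_ROW = """<table><tr class="even">
<td style="display:none">id&gt;277364359,id32&gt;hash id,level&gt;0,name&gt;main.py,type&gt;File py,size&gt;2.06 KB</td>
<td>main.py</td>
</tr></table>"""

print(find_hash_id(SAMPLE_ROW, "main.py"))  # prints: hash id
```

With lxml, the same parse_hidden_cell helper could be fed the text_content() of each cell matched by ".//td[@style='display:none']", and the page source would be the body of the response fetched after login.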



