Simulating a login with cookies, and a scrape that failed


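It started when a friend noticed that every md5 lookup meant logging in to the website and running the lookup by hand. That felt tedious, so the idea was a script that takes an md5 value and prints the result directly.

So off we went.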

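1 Simulating the login

As usual, the first step was to submit the form and capture the traffic (I used Fiddler). The POST form was right there in the capture, but on a whim I decided that logging in by submitting the form every time was boring. I had also been writing web code recently, so I wanted to try using the cookie instead.

The captured cookie shows two things:

the ntime field of CNZZDATA3819543 is a timestamp,

and user acts as the session token. Everything else stays the same between requests, so the simulated-login script can be written: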

import time

import requests
from bs4 import BeautifulSoup

URL = 'http://www.xxx.com/'


def get_html(url):
    session = requests.session()
    # The HTTP header is spelled 'User-Agent' (hyphen, not underscore)
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) '
                             'AppleWebKit/537.36 (KHTML, like Gecko) '
                             'Chrome/30.0.1581.2 Safari/537.36'}
    # Cookie values copied from a logged-in browser session; only ntime changes
    cookies = {'ASP.NET_SessionId': "eqsrnjotcaj5qdf5kmrqwgpy",
               'CNZZDATA3819543': "cnzz_eid=471312766-1484873928-&ntime=%d" % int(time.time()),
               'FirstVisit': "",
               '_test': "1",
               'comefrom': "http://www.xxx.com/login.aspx",
               'key': "",
               'user': "kPXxHtwrSPpCMgZoXs2VrPuwuuCUrDz7dLq5R3/DBEP59eqYGYFa23AZdDPP1KDR9"
                       "rblhGp0HWbYVkOsCg3QoRwWHIQESmZi4KqRlXxfnuZcFsrEta5SwAmrrvhpNvK"
                       "ghSMRdyV7PTmKuagc7m8IZQ=="}
    # The original listing stops at the cookies; fetching the page with them
    # is the assumed next step
    text = session.get(url, headers=headers, cookies=cookies).text
    session.close()
    return text

Parsing the HTML that comes back yields the username and email address.
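The markup of the logged-in page is not shown here, so the following is only a sketch of how such a value might be pulled out with BeautifulSoup. The helper name text_by_id, the sample markup and the element ids are all hypothetical. The helper returns None instead of raising when the element is missing, which makes a failed login easier to spot.

```python
from bs4 import BeautifulSoup


def text_by_id(html, element_id):
    """Return the stripped text of the element with this id, or None if absent."""
    soup = BeautifulSoup(html, 'html.parser')
    element = soup.find(id=element_id)
    if element is None:
        return None
    text = element.get_text(strip=True)
    return text or None


# Made-up markup standing in for the logged-in page (ids are assumptions)
SAMPLE_HTML = '<div><span id="username"> alice </span><span id="email"></span></div>'

username = text_by_id(SAMPLE_HTML, 'username')
email = text_by_id(SAMPLE_HTML, 'email')
missing = text_by_id(SAMPLE_HTML, 'no-such-id')
```

With the real page, the ids would have to be read from the HTML the site actually returns.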

After that, the session can be used for GET or POST requests.
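To reuse the login across several requests, the cookies can be stored on the session itself instead of being passed to each call. This is a minimal sketch; the helper name make_session and the placeholder cookie values are assumptions, and the real values would come from a logged-in browser capture.

```python
import time

import requests


def make_session(cookies, user_agent):
    """Return a requests.Session that sends these cookies on every request."""
    session = requests.Session()
    session.headers.update({'User-Agent': user_agent})
    session.cookies.update(cookies)
    return session


# Placeholder values; substitute the ones captured from the browser
captured = {
    'ASP.NET_SessionId': 'SESSION-ID-FROM-BROWSER',
    'CNZZDATA3819543': 'cnzz_eid=PLACEHOLDER&ntime=%d' % int(time.time()),
    'user': 'USER-TOKEN-FROM-BROWSER',
}
session = make_session(captured, 'Mozilla/5.0')
# session.get(url) and session.post(url, data=...) now carry these cookies
```

Any cookie the server sets in a response is also kept on the session, so later requests stay consistent with what the server expects.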

 

2 Into the pit: querying an md5 after login and capturing the traffic

Next, look at the form.

 

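Analyzing the form:

__EVENTTARGET and __EVENTARGUMENT do nothing useful here; both are "" on every request.

__VIEWSTATE matters a great deal. It looked to me like the output of an encryption algorithm that takes the value being queried plus some other inputs I could not identify (this is the wall my scraper ran into). In ASP.NET Web Forms this field carries the page's serialized state, Base64-encoded and optionally signed or encrypted by the server, so the client receives it in a hidden input rather than computing it.

__VIEWSTATEGENERATOR, going by the name, is the generator for the viewstate above. My guess is that it is some kind of encryption algorithm too (I did not look into it further).

ctl00$ContentPlaceHolder1$TextBoxInput is the value we enter to be decrypted.

ctl00$ContentPlaceHolder1$InputHashType is the hash type we select; the default appears to be md5.

The remaining values are not much use either.

Put simply, as long as __VIEWSTATE and ctl00$ContentPlaceHolder1$TextBoxInput correspond and match, the request should go through.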
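One possible way past that wall, assuming the site behaves like a standard ASP.NET Web Forms page, is to fetch the query page first and copy its hidden fields into the POST instead of hard-coding them. The sketch below uses only the standard library's html.parser. SAMPLE_PAGE is made-up markup, build_payload is a hypothetical helper, and the field names are the ones from the captured form. Whether the real site accepts a payload built this way has not been tested.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect the name/value pair of every <input type="hidden"> on a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'input' and (attrs.get('type') or '').lower() == 'hidden':
            name = attrs.get('name')
            if name:
                self.fields[name] = attrs.get('value') or ''


def build_payload(page_html, md5_value):
    """Merge the page's own hidden fields with the value to look up."""
    parser = HiddenFieldParser()
    parser.feed(page_html)
    payload = dict(parser.fields)
    payload['ctl00$ContentPlaceHolder1$TextBoxInput'] = md5_value
    payload['ctl00$ContentPlaceHolder1$InputHashType'] = 'md5'
    payload['ctl00$ContentPlaceHolder1$Button1'] = '查詢'
    return payload


# Made-up page standing in for the HTML returned by a GET of the query page
SAMPLE_PAGE = (
    '<form>'
    '<input type="hidden" name="__VIEWSTATE" value="abc123" />'
    '<input type="hidden" name="__VIEWSTATEGENERATOR" value="CA0B0334" />'
    '<input type="text" name="ctl00$ContentPlaceHolder1$TextBoxInput" />'
    '</form>'
)
payload = build_payload(SAMPLE_PAGE, '21232f297a57a5a743894a0e4a801fc3')
```

In a real run, page_html would be the text of a GET made with the same session that later sends the POST, so that the view state and the cookies belong together.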

3 Finally, my failed scraper
#!/usr/bin/python
# -*- coding: utf-8 -*-
import time

import requests
from bs4 import BeautifulSoup

URL = 'http://www.xxx.com/'


def get_html(url):
    session = requests.session()
    # The HTTP header is spelled 'User-Agent' (hyphen, not underscore)
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) '
                             'AppleWebKit/537.36 (KHTML, like Gecko) '
                             'Chrome/30.0.1581.2 Safari/537.36'}
    # Cookie values copied from a logged-in browser session; only ntime changes
    cookies = {'ASP.NET_SessionId': "eqsrnjotcaj5qdf5kmrqwgpy",
               'CNZZDATA3819543': "cnzz_eid=471312766-1484873928-&ntime=%d" % int(time.time()),
               'FirstVisit': "",
               '_test': "1",
               'comefrom': "http://www.xxxx.com/login.aspx",
               'key': "",
               'user': "kPXxHtwrSPpCMgZoXs2VrPuxxuCUrDz7dLq5R3/DBEP59eqYGYFa23AZdDPP1KDR9"
                       "rblhGp0HWbYVkOsCg3QoRwWHIQESmZi4KqRlXxfnuZcFsrEta5SwAmrrvhpNvK"
                       "ghSMRdyV7PTmKuagc7m8IZQ=="}
    # Form fields hard-coded from a single captured request
    payloads = {'__EVENTTARGET': "",
                '__EVENTARGUMENT': "",
                '__VIEWSTATE': "XrQ+lfRMi82hZRL/drrjo0zDnT6/XJxrr0iphlxrVrNfVusZC2UHmQL5"
                               "i4TbbaD8N6zKVxODMamXqkA0k7T1qoNfW9dRGs/V6mEptB90XdBB4Qj1"
                               "n1jGG/iw+p7BW4oHPanh8mWCH3G5ZWuZM4TADQoGwOuXna0OWtVK/x8k00"
                               "+zZEwKXi0vI2T9OrysyhkZ8msq/yashFfMyDo+Qwqb3jNJWl8n844E9Kmb4"
                               "gcBuBmifviw7jvRJjpVQNqDH+Cbee7gMEvFK4rtKxKcCkxIGNvC46F59rl"
                               "62EfVX81NFVSD0dhGNnF7kP0WRWpcXZRoXrxd2HFodv5beAw8Gwe7IRHr59"
                               "T8/GmiS3KVRMDXMG9OgAg13mZv9f/LogkuNmPeiIVz9fBifx2D2kUdQQfT5x"
                               "T0wbqoGQnWqeQcEYndUCp5lA8kCID4V8p0TR3EfrzAHPlxPh7be8yNHL8iHu"
                               "50wgxJ6BD2W3VoeF3lOShhkpnHYAeQf7TLaCCPtKleCboctIO6dbcgt1KD6S"
                               "UvJZyWuRRxz/CBAGNEr6piRudKOgnGl+W9nBfJDS4wl3ao3Y3Rvuon0YMz68"
                               "o+Ef4FOExM300T51rL5HF5e8zyw+V68ISvXAoHJmhzt64j+ht0jOUzLI1UTXo"
                               "MOg894gucdsH8VOpVNPO5F+4/03JHqi8R4cSHnFu9U9gYpnGBhIhZuzzyiLHj"
                               "a3gqyHzehKBlWq53eOhXJH/IfVjGZ9ltjZHi9smWCMonqvZRTm0vD6nKCsQWi"
                               "JILUzb8YrI7xzYgjHihSEyYc3qi9ze6uSwUdeJbQdqKiGVWMWt+gRxi7JZDae"
                               "SMfN3NvavFtXdyBVyI1KFuP9LBYDYEH1RD6HXqVsblH4C1dIAq7yQnu4L20OzI"
                               "E841MIiwLdQVAQ9aAwD3wqvPqoBJfqbkMBKQ7xSiDF+FSRacJ/IHOAJkMoqKJe4LY"
                               "Csh0tPK1tK1pW7xF/X+PtQCQQ+Ldin76t3bpeY2KAQeF5cXEP94DIYydiJBfn4zJv+D"
                               "QBzb0zRabwy5GBB1YDY9Fxiw34G1rB18yOlTwl2bpFnUArplpB0TwfjGkA7Up2MCrOy"
                               "s6oDDdRn+1AQOETo7Ych274ymw+ThCzUrJeVNPf5/X2FJCJpqeH0TRCSs+0fxbaljihS9"
                               "p3t1WqTxTHWKsh4TsZBQsn90kSItZS/dGYhNH/XUVombBi92AhUrokHqQC4b0mGdIRFRzg"
                               "6l2lF4VfZbDfIayTgnZbT+N9RwcduCZCRWcUupLLcKnCZHuqd7WStG33dTk9IT/5q2xf57G"
                               "fRDxslLzN1VIDn8Wtcl494OJPSPqr5+FB8mTs24UjM+6IwgVNstkJFIH1urQWl31TVUg"
                               "nhtrIQEs4MpyeeUUwlV2CCfxP+JTGbZsuMHdd/RDwp9xH28dGQD0cikU8RlCut/XThG"
                               "W10bPC2akAXO5xmACNBhY9XKvyMzg8D43AFa3xAxV+e9lwPhNHIQCX7c6m/t5rQztzM"
                               "+TiraaMMGXZVyjFic757VcJHlU5We8r7lWsKBRbrqnIEV6JMi8dzmb5rLYbBbLI4N9Q"
                               "DIwy5r0HKDmepTjhZY3DIFLkdO9RakjAoiFUs2e9h+wPxBQGQ+UbyWXzfSWa8hXKSGL"
                               "kw774/Et5XfCPVaDBkqPPzKlX3QoV5ptuRuDCwzLdXpuBePhme64x09L9XOmIYFdaGJ"
                               "MXjw/tKRTv6AFgGLvZyso+Ch9XLI/j5abcaLyC/nSUdsxexRPkV/wRB5pSsaau43nMn"
                               "iMpuAVVxwryPTGnnAO38vl26BAo73jlvNvmP0Av22/3P+A2CmCcJt6S5bH7Jcw6S6HJ"
                               "QXWDtnFGg6sYCi6mzvwmYFcBEeVzOKHJ8f7TxP7n5CbNjXWnBguSFL1UzH83DTcij6s+1lctI"
                               "fw4NIN7NU5P+qInfSRvBH3754GAuSApuLZHOp/9k8fkkxlA==",
                '__VIEWSTATEGENERATOR': "CA0B0334",
                'ctl00$ContentPlaceHolder1$TextBoxInput': "21232f297a57a5a743894a0e4a801fc3",
                'ctl00$ContentPlaceHolder1$InputHashType': "md5",
                'ctl00$ContentPlaceHolder1$Button1': "查詢",
                'ctl00$ContentPlaceHolder1$HiddenField1': "",
                'ctl00$ContentPlaceHolder1$HiddenField2': "gnSxKhU+42ESHE0pCcCyudmYfvxVL2+w4IhvdkwT37OI/"
                                                          "QODVV7mdVAN9puROPjh"}
    text = session.post(url, headers=headers, data=payloads, cookies=cookies).text
    session.close()
    return text


def parser_html(text):
    soup = BeautifulSoup(text, 'html.parser')
    # .strings is a generator that yields the tag's text fragments
    string_gen = (soup.find('div', class_='main')
                  .find('table', id='table3')
                  .find('span', id='ctl00_ContentPlaceHolder1_LabelAnswer')
                  .strings)
    result = list(string_gen)[0]
    return result


if __name__ == '__main__':
    text = get_html(URL)
    print(parser_html(text))