Python crawler script: log in to GitHub and view information

Source: Internet
Author: User

Analysis of the target website's login mechanism

Target address: https://github.com/login

Analysis of the login process:

First, the credentials are submitted through an HTML form.

Second, the form contains a CSRF token (the authenticity_token field).

Third, the POST request that sends the username and password must carry the cookies returned by the first GET request.

Finally, once the login succeeds, requests for other pages only need to carry the cookies returned by the successful login.
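To make the third and fourth points concrete, here is a minimal sketch of what "carrying a cookie" means at the HTTP level, using only the standard library. The header values below are invented for illustration, and the helper names `cookies_from_set_cookie` and `cookie_header` are hypothetical; the requests library does this bookkeeping for you.

```python
from http.cookies import SimpleCookie


def cookies_from_set_cookie(headers):
    """Collect name -> value pairs from a list of Set-Cookie header values."""
    jar = {}
    for header in headers:
        parsed = SimpleCookie()
        parsed.load(header)
        for name, morsel in parsed.items():
            jar[name] = morsel.value
    return jar


def cookie_header(jar):
    """Build the Cookie request header that sends the stored cookies back."""
    return "; ".join(f"{name}={value}" for name, value in jar.items())


# Invented Set-Cookie values standing in for the response to the first GET.
first_get_headers = [
    "logged_in=no; Path=/; Secure",
    "_gh_sess=abc123; Path=/; HttpOnly",
]

jar = cookies_from_set_cookie(first_get_headers)
# This string is what the following POST would send in its Cookie header.
print(cookie_header(jar))
```

Attributes such as Path and Secure only tell the client when to send a cookie, so they are dropped when the Cookie header is built.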

  

First, fetch the token and cookies we need with a GET request

Code:

    import requests
    from bs4 import BeautifulSoup

    r1 = requests.get('https://github.com/login')
    soup = BeautifulSoup(r1.text, features='lxml')  # build the soup object
    # find the token we need
    s1 = soup.find(name='input', attrs={'name': 'authenticity_token'}).get('value')
    r1_cookies = r1.cookies.get_dict()  # cookies to send with the username and password
    # print(r1_cookies)
    # print(s1)

    # Result: the cookie dict, then the token. The long _gh_sess value is
    # wrapped over several comment lines here for display.
    # {'logged_in': 'no', '_gh_sess': 'Vdfwa2hjwjfmb1hpruflrdvhumc3mxg1tk02tdhsunhdmerungpyt2y4stlqz2xcv1lczefhk21wdfr1bkpgyuv0wejzcdeydwfzcm93
    # Avc4nk91q2jicmtrv0niq0lrswm4afhrsvfybctcczbwdnhvn0yysvjjnufpqnhytznurkjwndjzuwxucek2m2jkm3vsmddxvhnoy1htqkthckjqzdjyuvr2rZbnuku3vnltrvf2u
    # M1admu3c3yzsglyvnvzvm0ycna1euhet1jrvwnln0psbndkwjljmgttng5urwj1eu8rqjzxnemxvethcgvobdfby2gvc2zzwxcvwwzab29wqwjyu0l6cmzscwHbqulzyta3dtrtb
    # 3l1s0hdyythy2v1suhewlzvvlzoswzptzbjnmlidff2dzi2bwgtltjon1lqbm5jwutsymtivem1cljpake9pq%3d%3d--897dbc36c123940c8eae5d86f276dead8318fd6c'}
    # prz0wapebu5shksgcesn0fijwou9alw8epusxlqgcw1ezirl0vbskvktyqie8vhxhph2h/uzgav6xx+yjtgova==
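The lookup above calls .get('value') directly on the result of soup.find, which fails with an AttributeError if the page no longer contains the expected input. The following sketch shows the same extraction with an explicit check. It swaps BeautifulSoup for the standard library's html.parser so it runs without extra packages, and it works on a made-up HTML snippet rather than the real login page; `TokenFinder`, `extract_token` and `SAMPLE_HTML` are hypothetical names.

```python
from html.parser import HTMLParser


class TokenFinder(HTMLParser):
    """Remember the value of the first <input name="authenticity_token">."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        is_token_input = (
            tag == "input" and attributes.get("name") == "authenticity_token"
        )
        if is_token_input and self.token is None:
            self.token = attributes.get("value")


def extract_token(html):
    """Return the token from the page, or raise if the input is missing."""
    finder = TokenFinder()
    finder.feed(html)
    if finder.token is None:
        raise ValueError("no authenticity_token input found in the page")
    return finder.token


# Made-up form standing in for the real login page.
SAMPLE_HTML = """
<form action="/session" method="post">
  <input type="hidden" name="authenticity_token" value="example-token-123">
  <input type="text" name="login">
  <input type="password" name="password">
</form>
"""

print(extract_token(SAMPLE_HTML))
```

Raising a clear error at this point makes a changed page layout easier to diagnose than a failure later in the POST.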

With these two values in hand, we can send the login request:

Second, submit the username and password with a POST request

Code:

    # This code continues from the GET request above; it is only the POST part.
    r2 = requests.post(
        'https://github.com/session',
        data={
            'commit': 'Sign in',
            'utf8': '✓',
            'authenticity_token': s1,
            'login': '[email protected]',
            'password': 'password',  # fill in the correct username and password
        },
        cookies=r1.cookies.get_dict(),  # the cookies from the first request are required here
    )
    print(r2.cookies.get_dict())  # these are the cookies after a successful login

  

On success, the server returns the page shown after login.
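As a small sketch of the POST step, the form fields can be built in one helper and the returned cookies checked before going further. Two things here are assumptions rather than facts from the article: that the utf8 field holds a check mark character, and that a successful login flips the logged_in cookie (shown as 'no' in the first response) to 'yes'. The names `build_login_form` and `looks_logged_in` are hypothetical.

```python
def build_login_form(token, login, password):
    """Return the form fields sent to https://github.com/session."""
    return {
        "commit": "Sign in",
        "utf8": "\u2713",  # check mark character (assumption about this field)
        "authenticity_token": token,
        "login": login,
        "password": password,
    }


def looks_logged_in(cookies):
    """Guess login success from the cookie dict returned by the POST.

    Assumes the site sets logged_in to 'yes' after a successful login.
    """
    return cookies.get("logged_in", "").lower() == "yes"


form = build_login_form("example-token-123", "user@example.com", "secret")
print(sorted(form))
print(looks_logged_in({"logged_in": "no"}))
```

With the article's variables, the check would be applied as looks_logged_in(r2.cookies.get_dict()) before requesting any further page.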

Third, view the personal details page once the POST login has succeeded

Only the cookies returned by the successful login are required here.

    # Complete code
    import requests
    from bs4 import BeautifulSoup

    r1 = requests.get('https://github.com/login')
    soup = BeautifulSoup(r1.text, features='lxml')
    s1 = soup.find(name='input', attrs={'name': 'authenticity_token'}).get('value')
    r1_cookies = r1.cookies.get_dict()
    print(r1_cookies)
    print(s1)

    r2 = requests.post(
        'https://github.com/session',
        data={
            'commit': 'Sign in',
            'utf8': '✓',
            'authenticity_token': s1,
            'login': '[email protected]',
            'password': 'password',
        },
        cookies=r1.cookies.get_dict(),
    )
    print(r2.cookies.get_dict())

    # View the personal details page
    r3 = requests.get(
        'https://github.com/13131052183/product',  # the personal details page
        cookies=r2.cookies.get_dict(),
    )
    print(r3.text)
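The complete code passes cookies by hand from one request to the next. One possible alternative, sketched here, is to route all three requests through a single session object so the cookies are stored and resent automatically; requests.Session offers get() and post() methods that behave this way. The function `login_and_fetch` and its `extract_token` parameter are hypothetical: `extract_token` stands for any function that pulls the token out of the login page HTML, such as the BeautifulSoup lookup used above.

```python
def login_and_fetch(session, extract_token, login, password, page_url):
    """Log in through one shared session, then fetch page_url.

    session: an object with requests.Session-style get() and post() methods.
    extract_token: hypothetical callable mapping login-page HTML to the token.
    """
    # Step 1: GET the login page; the session keeps the cookies it receives.
    login_page = session.get("https://github.com/login")
    token = extract_token(login_page.text)

    # Step 2: POST the credentials; the stored cookies are sent automatically.
    session.post(
        "https://github.com/session",
        data={
            "commit": "Sign in",
            "utf8": "\u2713",  # check mark character (assumption about this field)
            "authenticity_token": token,
            "login": login,
            "password": password,
        },
    )

    # Step 3: fetch the target page with the cookies set by the login.
    return session.get(page_url).text


# Example call, left commented out because it performs real network requests:
# import requests
# html = login_and_fetch(requests.Session(), my_extract_token,
#                        "username", "password", "https://github.com/username")
```

Taking the session as a parameter also lets the flow be exercised with a stand-in object, without touching the network.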

  
