Python crawler script: log in to GitHub and view information

Last Update:2018-07-16 Source: Internet

Author: User


Introduction: analyzing the target website's login mode

Target address: https://github.com/login

Analysis of the login method:

First, the login information is submitted through an HTML form.

Second, the form carries a CSRF token (authenticity_token).

Third, when the username and password are sent in the POST request, the cookies returned by the first GET request must be sent along with them.

Finally, once the login succeeds, requests for other pages only need to carry the cookies returned by that successful login.
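To make the second point concrete, here is a minimal sketch of pulling a CSRF token out of a login form. It uses only the standard library's html.parser as a stand-in for BeautifulSoup, so it runs without third-party packages. The SAMPLE_FORM markup, the TokenParser class, and the extract_token helper are hypothetical names for illustration; the real GitHub page contains more fields than this simplified form.

```python
from html.parser import HTMLParser


class TokenParser(HTMLParser):
    """Collects the value of the hidden authenticity_token input."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == 'input' and attributes.get('name') == 'authenticity_token':
            # Keep only the first matching input.
            if self.token is None:
                self.token = attributes.get('value')


def extract_token(html):
    """Return the token value found in the HTML, or None if it is absent."""
    parser = TokenParser()
    parser.feed(html)
    return parser.token


# Hypothetical, simplified form markup (an assumption, not GitHub's real page).
SAMPLE_FORM = '''
<form action="/session" method="post">
  <input type="hidden" name="authenticity_token" value="example-token-123">
  <input type="text" name="login">
  <input type="password" name="password">
  <input type="submit" name="commit" value="Sign in">
</form>
'''

print(extract_token(SAMPLE_FORM))  # prints: example-token-123
```

The token is an ordinary hidden input, which is why it has to be read from the GET response before the POST can be built.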

First, use a GET request to obtain the token and cookies we need

Code:

import requests
from bs4 import BeautifulSoup

r1 = requests.get('https://github.com/login')
soup = BeautifulSoup(r1.text, features='lxml')  # build the soup object
# Find the token we need.
s1 = soup.find(name='input', attrs={'name': 'authenticity_token'}).get('value')
r1_cookies = r1.cookies.get_dict()  # cookies to send when the username is submitted
# print(r1_cookies)
# print(s1)
# Result:
# {'logged_in': 'no', '_gh_sess': 'Vdfwa2hjwjfmb1hpruflrdvhumc3mxg1tk02tdhsunhdmerungpyt2y4stlqz2xcv1lczefhk21wdfr1bkpgyuv0wejzcdeydwfzcm93
# Avc4nk91q2jicmtrv0niq0lrswm4afhrsvfybctcczbwdnhvn0yysvjjnufpqnhytznurkjwndjzuwxucek2m2jkm3vsmddxvhnoy1htqkthckjqzdjyuvr2rZbnuku3vnltrvf2u
# M1admu3c3yzsglyvnvzvm0ycna1euhet1jrvwnln0psbndkwjljmgttng5urwj1eu8rqjzxnemxvethcgvobdfby2gvc2zzwxcvwwzab29wqwjyu0l6cmzscwHbqulzyta3dtrtb
# 3l1s0hdyythy2v1suhewlzvvlzoswzptzbjnmlidff2dzi2bwgtltjon1lqbm5jwutsymtivem1cljpake9pq%3d%3d--897dbc36c123940c8eae5d86f276dead8318fd6c'}
# prz0wapebu5shksgcesn0fijwou9alw8epusxlqgcw1ezirl0vbskvktyqie8vhxhph2h/uzgav6xx+yjtgova==

With these two values in hand, the next step is to send the login request:

Second, use a POST request to submit the username and password

Code:

# This continues from the GET request above; only the POST part is shown.
r2 = requests.post(
    'https://github.com/session',
    data={
        'commit': 'Sign in',
        'utf8': '✓',
        'authenticity_token': s1,
        'login': '[email protected]',
        'password': 'password',  # fill in the correct username and password
    },
    cookies=r1.cookies.get_dict(),  # the cookies from the first request are required here
)
print(r2.cookies.get_dict())  # these are the cookies after a successful login

On success, the response contains the page that is shown after logging in.
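One way to tell whether the POST actually logged in is to inspect the returned cookies. The sketch below assumes that the logged_in cookie, which was 'no' after the first GET request, changes to 'yes' after a successful login; that behavior is an assumption and should be checked against a real response. The login_succeeded helper is a hypothetical name.

```python
def login_succeeded(cookies):
    """Return True if the cookie dict suggests an authenticated session.

    Assumption: the site sets logged_in to 'yes' once the login succeeds.
    """
    return cookies.get('logged_in', '').lower() == 'yes'


# Cookie dicts shaped like the output of r2.cookies.get_dict().
print(login_succeeded({'logged_in': 'no', '_gh_sess': 'abc'}))   # prints: False
print(login_succeeded({'logged_in': 'yes', '_gh_sess': 'abc'}))  # prints: True
```

With the code above, this would be called as login_succeeded(r2.cookies.get_dict()) before requesting any other page.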

Third, view the personal details page after the successful POST login

Only the cookies from the successful login are required here.

# Complete code
import requests
from bs4 import BeautifulSoup

r1 = requests.get('https://github.com/login')
soup = BeautifulSoup(r1.text, features='lxml')
s1 = soup.find(name='input', attrs={'name': 'authenticity_token'}).get('value')
r1_cookies = r1.cookies.get_dict()
print(r1_cookies)
print(s1)

r2 = requests.post(
    'https://github.com/session',
    data={
        'commit': 'Sign in',
        'utf8': '✓',
        'authenticity_token': s1,
        'login': '[email protected]',
        'password': 'password',
    },
    cookies=r1.cookies.get_dict(),
)
print(r2.cookies.get_dict())

# View the personal details page.
r3 = requests.get(
    'https://github.com/13131052183/product',  # personal details page
    cookies=r2.cookies.get_dict(),
)
print(r3.text)
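Passing cookies by hand between requests works, but a session object can carry them automatically. The following is a sketch of that alternative, not the article's method. The login_with_session function is a hypothetical helper: it accepts any object with get and post methods that behaves like requests.Session, plus a callable that extracts authenticity_token from the login page HTML. The form fields are the ones used in the code above.

```python
def login_with_session(session, username, password, extract_token):
    """Log in through a session object that stores cookies between requests.

    session:       an object like requests.Session (has get and post methods)
    extract_token: a callable that returns authenticity_token from page HTML
    """
    # The GET response sets the initial cookies on the session.
    login_page = session.get('https://github.com/login')
    token = extract_token(login_page.text)

    # The session sends the stored cookies along with the POST by itself.
    response = session.post(
        'https://github.com/session',
        data={
            'commit': 'Sign in',
            'utf8': '✓',
            'authenticity_token': token,
            'login': username,
            'password': password,
        },
    )
    return response
```

In practice the first argument would be requests.Session(), and the callable could wrap the BeautifulSoup lookup shown earlier. After the call, session.get('https://github.com/13131052183/product') would reuse the login cookies without a cookies= argument.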


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
