Objective
When logging in to a website you often run into token parameters. Associating the token is not the hard part; the hard part is finding where the server first returns the token value. Once that value has been extracted, it can be passed on dynamically to later requests.
Logging in to Lagou
1. First open the login page https://passport.lagou.com/login/login.html, log in with an account and password, and capture the traffic to inspect the request details.
Find where the token is generated
1. Open the login page https://passport.lagou.com/login/login.html and press F5 to refresh it (refresh only; do not enter an account or password), then search the returned page for the place where the token is generated.
Look at the comments in the page source:
</script>
<!-- 页面样式 -->
<!-- 动态token,防御伪造请求,重复提交 -->
<script>
    window.X_Anti_Forge_Token = '286fd3ae-ef82-4019-89c4-9408947a0e26';
    window.X_Anti_Forge_Code = '74603111';
</script>
This is front-end code, and the comment ("dynamic token, guards against forged requests and duplicate submissions") gives away the token's location. Ha!
2. Then parse the values of the two parameters, token and code, out of the returned HTML.
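The parsing step can be tried offline before any request is sent. The following is a minimal sketch, not the author's method: it runs a regular expression over the script portion of the snippet quoted above (written with plain ASCII quotes) instead of going through BeautifulSoup. The helper name `extract_js_var` and the constant `SAMPLE_HTML` are hypothetical names introduced only for this illustration.

```python
import re

# Script portion of the page source quoted above, with plain ASCII quotes.
SAMPLE_HTML = (
    "<script> window.X_Anti_Forge_Token = '286fd3ae-ef82-4019-89c4-9408947a0e26'; "
    "window.X_Anti_Forge_Code = '74603111';</script>"
)


def extract_js_var(html, name):
    """Hypothetical helper: return the value assigned to window.<name>, or None."""
    pattern = r"window\." + re.escape(name) + r"\s*=\s*'([^']*)'"
    match = re.search(pattern, html)
    return match.group(1) if match else None


token = extract_js_var(SAMPLE_HTML, "X_Anti_Forge_Token")
code = extract_js_var(SAMPLE_HTML, "X_Anti_Forge_Code")
print(token, code)
```

Searching the whole page for the variable name avoids depending on the position of the script tag, at the cost of depending on the exact assignment syntax; either can break if the page changes.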
Simulated login
1. The password parameter is encrypted at login, but the encryption is fixed (the same password always yields the same string), so you can simply copy the encrypted string from the captured request.
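To see why a captured string can be replayed, here is a small sketch of what "fixed" means. It is an illustration under a loud assumption: a plain unsalted MD5 stands in for whatever transformation the site's JavaScript really applies, which this article does not identify. The only observation taken from the article is that the captured string used in the reference code is 32 hexadecimal characters long, the same length as an MD5 digest. `fixed_encrypt` is a hypothetical name.

```python
import hashlib


def fixed_encrypt(password):
    # Assumption for illustration only: unsalted MD5 as a stand-in for the
    # site's real (unidentified) client-side transformation.
    return hashlib.md5(password.encode("utf-8")).hexdigest()


first = fixed_encrypt("example-password")
second = fixed_encrypt("example-password")

# No per-request salt or nonce is mixed in, so the output never changes
# and a value captured once remains valid on replay.
print(first == second)
print(len(first))
```

If the site mixed a per-request value into the password hash, the copied string would stop working and the transformation itself would have to be reproduced in the script.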
Reference code:
# coding: utf-8
import re

import requests
from bs4 import BeautifulSoup


def login(s, gtoken, user, psw):
    '''
    s = requests.session()
    gtoken: return value of gettokencode
    user: account
    psw: password (the encrypted string copied from the packet capture)
    '''
    url2 = "https://passport.lagou.com/login/login.json"
    h2 = {
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0",
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
        "X-Requested-With": "XMLHttpRequest",
        "X-Anit-Forge-Token": gtoken['X-Anit-Forge-Token'],
        "X-Anit-Forge-Code": gtoken['X-Anit-Forge-Code'],
        "Referer": "https://passport.lagou.com/login/login.html"
    }
    s.headers.update(h2)
    body = {
        "isValidate": "true",
        "username": user,  # pass in the user argument
        "password": psw,  # pass in the psw argument
        "request_form_verifyCode": "",
        "submit": ""
    }
    r2 = s.post(url2, data=body, verify=False)
    try:
        print(r2.text)
        return r2.json()
    except ValueError:
        # r2.json() raises a ValueError subclass when the body is not valid JSON
        print("Login exception information: %s" % r2.text)
        return None


def gettokencode(s):
    url = 'https://passport.lagou.com/login/login.html'
    h = {
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
    }
    # Update the session's headers
    s.headers.update(h)
    data = s.get(url, verify=False)
    soup = BeautifulSoup(data.content, "html.parser", from_encoding='utf-8')
    tokencode = {}
    try:
        # Find the second script tag, which is the one we need, and get its text as a string
        t = soup.find_all('script')[1].get_text()
        print(t)
        tokencode['X-Anit-Forge-Token'] = re.findall(r"Token = '(.+?)'", t)[0]
        tokencode['X-Anit-Forge-Code'] = re.findall(r"Code = '(.+?)'", t)[0]
        return tokencode
    except IndexError:
        # Raised when the script tag or either value is missing from the page
        print("Get token and code failed")
        tokencode['X-Anit-Forge-Token'] = ""
        tokencode['X-Anit-Forge-Code'] = ""
        return tokencode


if __name__ == '__main__':
    s = requests.session()
    token = gettokencode(s)
    login(s, token, "XX", "d4fb38a060abe164f6e2e1a71473329d")
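
One detail of the reference code: when extraction fails, gettokencode returns a dictionary of empty strings and the main block still goes on to call login. A small guard can catch that case first. This is a hedged sketch rather than part of the original script; `token_is_usable` is a hypothetical helper and the dictionary keys in the example are illustrative.

```python
def token_is_usable(tokencode):
    """Hypothetical helper: True only if the dict is non-empty and every value is non-empty."""
    return bool(tokencode) and all(tokencode.values())


# Illustrative dictionaries shaped like the return value of gettokencode.
good = {
    "X-Anit-Forge-Token": "286fd3ae-ef82-4019-89c4-9408947a0e26",
    "X-Anit-Forge-Code": "74603111",
}
failed = {"X-Anit-Forge-Token": "", "X-Anit-Forge-Code": ""}

print(token_is_usable(good))
print(token_is_usable(failed))
```

In the main block, login would then be called only when `token_is_usable(token)` is true, so a failed extraction is reported once instead of surfacing later as a rejected login.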