Simple image recognition in Python -- Verification codes Ⅲ
Implementing automatic website login
As a simple example, we log in to the school library management system. Python can still recognize a simple, purely numeric verification code with no interference, but for codes that mix letters and digits and add interference the error rate is very high, so this time I use "manual recognition": a person reads the code and types it in.
The first thing to understand is the role of cookies: data that some websites store on the user's local machine in order to identify the user and track the session. So we use the cookielib module to keep the cookies the site hands out.
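To make the cookie mechanism concrete, here is a minimal sketch using only the Python 3 standard library, where `http.cookiejar` and `urllib.request` are the renamed successors of `cookielib` and `urllib2`. The tiny local server, its `/captcha` and `/login` paths, and the cookie name `SESSIONID` are all made up for illustration; they are not the library system's real endpoints.

```python
import threading
import urllib.request
from http.cookiejar import CookieJar
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a site that tracks sessions with a cookie."""

    def do_GET(self):
        self.send_response(200)
        if self.path == "/captcha":
            # First visit: hand out a session cookie.
            body = b"captcha-image-bytes"
            self.send_header("Set-Cookie", "SESSIONID=abc123; Path=/")
        else:
            # Later visits: echo back whatever cookie the client sent.
            body = (self.headers.get("Cookie") or "no cookie").encode("utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:%d" % server.server_port

    jar = CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),           # ignore system proxies for this local demo
        urllib.request.HTTPCookieProcessor(jar),   # store and resend cookies automatically
    )
    try:
        opener.open(base + "/captcha").read()          # server sets the cookie
        echoed = opener.open(base + "/login").read()   # opener sends it back
    finally:
        server.shutdown()
        server.server_close()
    return [c.name for c in jar], echoed.decode("utf-8")


if __name__ == "__main__":
    print(run_demo())
```

Running it should show that the jar holds one cookie after the first request and that the second request carried it back to the server without any extra code, which is the behavior the login script relies on.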
Login page of the school library management system: http://122.207.221.227:8080/opac/login. Verification code: http://122.207.221.227:8080/kaptcha/goldlib
You will find that the verification code is dynamic: it is different every time the page is opened, and in general the code is tied to the cookie. Recognizing it automatically is a thankless task, so the idea is to request the verification code page first, save the image and pick up the cookie used for login, and then POST the data directly to the login address.
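The approach just described can be sketched as a single function. This is a hedged Python 3 illustration rather than the code from this post: `login_with_captcha`, its parameters, and the default field name `kaptcha` are names chosen here, and `opener` is assumed to be a cookie-aware opener such as one built with `urllib.request.HTTPCookieProcessor`.

```python
import urllib.parse


def login_with_captcha(opener, captcha_url, login_url, fields,
                       image_path, ask=input, captcha_field="kaptcha"):
    """Fetch the captcha, let a human read it, then POST the login form.

    `opener` must keep cookies between calls, so that the captcha request
    and the login request belong to the same session.
    """
    # Step 1: requesting the captcha also makes the server issue the session cookie.
    image = opener.open(captcha_url).read()

    # Step 2: save the image so a person can look at it.
    with open(image_path, "wb") as f:
        f.write(image)

    # Step 3: "manual recognition" - ask the user to type what they see.
    code = ask("Captcha: ").strip()

    # Step 4: POST the credentials plus the captcha with the same opener.
    form = dict(fields)
    form[captcha_field] = code
    data = urllib.parse.urlencode(form).encode("utf-8")
    return opener.open(login_url, data).read()
```

Passing the opener and the prompt function in as arguments keeps the order of the steps visible and lets the function be tried against a stub before pointing it at a real site.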
Next, analyze the request and the header information needed for the POST on the login page.
You can see that the URL to POST to is not the page you visit, but http://122.207.221.227:8080/pages/include/checklogin.jsp
The form data to submit carries the user name and the password in the username and password fields respectively.
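As a small illustration of what submitting form data means on the wire, the sketch below encodes a form the way a browser does for `application/x-www-form-urlencoded`. It uses Python 3's `urllib.parse.urlencode` (the successor of Python 2's `urllib.urlencode`); the values are placeholders, and only the `username` and `password` field names come from the analysis above.

```python
from urllib.parse import parse_qs, urlencode

form = {
    "username": "alice",        # placeholder value
    "password": "s3cret&safe",  # placeholder; the '&' must be escaped
}

body = urlencode(form)
print(body)  # username=alice&password=s3cret%26safe

# A server decodes the body back into the same fields.
decoded = parse_qs(body)
print(decoded["password"][0])  # s3cret&safe
```

Letting `urlencode` build the body avoids hand-escaping mistakes when a password contains characters such as `&` or `=`.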
With these points analyzed, the code is given directly below.
# coding=utf-8
# Library login (Python 2)
import cookielib
import sys
import urllib
import urllib2

reload(sys)
sys.setdefaultencoding("utf-8")  # prevent errors with Chinese text

url = 'http://122.207.221.227:8080/pages/include/checklogin.jsp'
capchaurl = 'http://122.207.221.227:8080/kaptcha/0.5458022691509324'

cookie = cookielib.CookieJar()
# Bind the cookie jar to an opener; cookielib then manages the cookies automatically
handler = urllib2.HTTPCookieProcessor(cookie)
opener = urllib2.build_opener(handler)

username = 'xxxxx'
password = 'xxxxx'  # user name, password
callno = 'callno'

# Access the captcha address with the opener to get the cookie
picture = opener.open(capchaurl).read()
# Save the captcha locally (raw string so the backslashes are kept literally)
local = open(r'C:\Users\ww\Desktop\goldlib.jpg', 'wb')
local.write(picture)
local.close()

secrecode = raw_input('Verification code: ')  # type in the captcha by hand

# Form fields, constructed from the captured request
postdata = {
    'username': username,
    'password': password,
    'logintype': callno,
    'kaptcha': secrecode,
}

# Headers, constructed from the captured request.
# Content-Length and Accept-Encoding are left out: urllib2 computes the
# length itself, and asking for gzip would mean decompressing the reply by hand.
headers = {
    'Accept': '*/*',
    'Accept-Language': 'zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2',
    'Connection': 'keep-alive',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Host': '122.207.221.227:8080',
    'Referer': 'http://122.207.221.227:8080/opac/login',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:59.0) Gecko/20100101 Firefox/59.0',
    'X-Requested-With': 'XMLHttpRequest',
}

data = urllib.urlencode(postdata)  # POST body in key1=value1&key2=value2 form
request = urllib2.Request(url, data, headers)  # build the request

try:
    response = opener.open(request)
    result = response.read().decode('utf-8')
    print result
except urllib2.HTTPError as e:
    print e.code
Demo results