Preface
CAPTCHAs? Can I crack those too?
There is not much to say by way of introduction. CAPTCHAs of all kinds turn up in everyday life, and the one students run into most often is the CAPTCHA on the Dean's Office system, such as the following:
Recognition method
A simulated login involves a number of fiddly steps. Here we are only concerned with one of them: returning an answer string for a given CAPTCHA image.
To interfere with recognition, a CAPTCHA image is deliberately made colorful and noisy, so the first task is to remove that interference. This step takes repeated experimentation; enhancing the image's color and increasing its contrast both help.
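As a rough illustration of this cleanup step, the sketch below stretches each pixel's contrast around mid-gray and then thresholds it to pure black or white. It works on a plain grid of RGB tuples so that it stays self-contained; with PIL the same idea is usually expressed through ImageEnhance and Image.point. The function names, the contrast factor and the threshold are assumptions chosen for illustration, not the values used in the original code.

```python
def to_gray(pixel):
    """Luminance of an (R, G, B) pixel using the common ITU-R 601 weights."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def boost_contrast(gray, factor=2.0):
    """Push a gray value away from mid-gray (128) and clamp it to 0..255."""
    value = 128 + (gray - 128) * factor
    return max(0.0, min(255.0, value))


def binarize(rgb_grid, factor=2.0, threshold=100):
    """Return a grid of 1 (ink) / 0 (background) from a grid of RGB tuples.

    Dark pixels are treated as character strokes; lighter, colorful
    pixels are treated as interference and dropped.
    """
    result = []
    for row in rgb_grid:
        out_row = []
        for pixel in row:
            gray = boost_contrast(to_gray(pixel), factor)
            out_row.append(1 if gray < threshold else 0)
        result.append(out_row)
    return result


# Hypothetical 2x3 image: dark strokes, a light-colored speck, white paper.
sample = [
    [(0, 0, 0), (255, 200, 120), (255, 255, 255)],
    [(20, 20, 20), (250, 250, 250), (10, 10, 10)],
]
cleaned = binarize(sample)
```

In practice the factor and threshold have to be tuned by trial and error against real CAPTCHA samples, which is the repeated testing described above.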
After trying a variety of image operations, I finally found a fairly reliable way to eliminate the interference. In the best case, what remains after the interference is removed is a clean black-and-white image of the characters. Each image contains four characters, and they cannot all be recognized in one go, so the image has to be cut into smaller images with exactly one character each, and each of those is then recognized separately.
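The cutting step can be sketched as follows, assuming the four characters occupy equally wide, fixed column ranges. The function name, the left margin and the character width are hypothetical parameters; the real offsets depend on the CAPTCHA in question. With PIL, Image.crop((left, upper, right, lower)) does the same job on an image object.

```python
def split_characters(grid, count=4, left=0, char_width=None):
    """Cut a 0/1 grid into `count` equally wide character grids.

    `left` is the column where the first character starts. If
    `char_width` is not given, the remaining width is divided evenly.
    """
    width = len(grid[0])
    if char_width is None:
        char_width = (width - left) // count
    pieces = []
    for n in range(count):
        start = left + n * char_width
        end = start + char_width
        # Take the same column range from every row.
        pieces.append([row[start:end] for row in grid])
    return pieces


# Hypothetical 2x8 black-and-white image holding four 2-pixel-wide glyphs.
image = [
    [1, 0, 0, 1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 0, 1, 1],
]
pieces = split_characters(image)
```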
The next step is to recognize the characters. We first convert each small image into a matrix of 0s and 1s, so that each matrix represents one character.
For example, here is the matrix for the digit 6:
num_6=[ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0 , 0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1 , 1,1,1,1,1,0,0,0,0,0,1,1,0,0,0,0,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,0,0,0,0,1,1,1,0,0,0,1,1,1,0,0 , 0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 , 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,]
Viewed from a distance with your eyes half closed, the digit can still be made out.
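To make the flat list easier to read, it helps to print it back out row by row. The sketch below flattens a small 0/1 grid into a list of the same kind as num_6 and renders it as text. The helper names and the tiny 4x5 glyph are made up for illustration, and the row width must be supplied by the caller because the flat list itself does not record it.

```python
def flatten(grid):
    """Turn a 2-D 0/1 grid into one flat list, row after row."""
    return [cell for row in grid for cell in row]


def render(flat, width):
    """Draw a flat 0/1 list as text, `width` cells per line."""
    lines = []
    for start in range(0, len(flat), width):
        row = flat[start:start + width]
        lines.append("".join("#" if cell else "." for cell in row))
    return "\n".join(lines)


# Hypothetical 4-wide, 5-high glyph that loosely resembles a 6.
tiny = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]
flat = flatten(tiny)
picture = render(flat, 4)
print(picture)
```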
Because this CAPTCHA is very regular, with each character sitting at a fixed position, there is no need for any machine learning algorithm. A simple matrix comparison is enough: among all the stored template matrices, pick the one most similar to the input. Several comparison methods would work here; whichever is used, this data is easy to recognize correctly.
At this point, the CAPTCHA recognition work is done.
The image operations for this CAPTCHA recognition were done mainly with Python's PIL. The complete code for the simulated login, which fills in the CAPTCHA automatically, is shown here:
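One simple comparison method is to count the positions where the input and a template agree and keep the template with the highest count. The sketch below is a minimal version of that idea; the function names and the 3x3 templates are hypothetical, and the original code may well have used a different similarity measure.

```python
def similarity(a, b):
    """Number of positions where two equal-length 0/1 lists agree."""
    if len(a) != len(b):
        raise ValueError("sample and template must have the same length")
    return sum(1 for x, y in zip(a, b) if x == y)


def recognize(sample, templates):
    """Return the label of the template most similar to `sample`."""
    best_label = None
    best_score = -1
    for label, template in templates.items():
        score = similarity(sample, template)
        if score > best_score:
            best_score = score
            best_label = label
    return best_label


# Hypothetical 3x3 templates, stored as flat lists like num_6.
templates = {
    "1": [0, 1, 0, 0, 1, 0, 0, 1, 0],
    "7": [1, 1, 1, 0, 0, 1, 0, 0, 1],
    "L": [1, 0, 0, 1, 0, 0, 1, 1, 1],
}

# A "7" with one stray pixel left over from the interference.
noisy_seven = [1, 1, 1, 0, 0, 1, 0, 1, 1]
result = recognize(noisy_seven, templates)
```

Because every character sits at a fixed position and is always drawn the same way, a leftover noise pixel or two barely changes the score, which is why such a plain comparison is sufficient here.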
Sample code
# -*- coding: utf-8 -*-
import re
import requests
import io
import os
import json
from PIL import Image
from PIL import ImageEnhance
from bs4 import BeautifulSoup
import mdata  # local module of the original project


class Student:
    def __init__(self, user, password):
        self.user = str(user)
        self.password = str(password)
        self.s = requests.Session()

    def login(self):
        url = "http://202.118.31.197/ACTIONLOGON.APPPROCESS?mode=4"
        res = self.s.get(url).text
        # The listing is damaged at this point. It originally continued with
        #     imageUrl = 'http://202.118.31.197/' + re.findall('...
        # and presumably went on to download the CAPTCHA image, clean it up,
        # split it into characters and match each one against the templates,
        # building up `verifycode`. Only the tail of that code survives:
        #         ... his):
        #             his = r
        #             chrr = i
        #     verifycode += chrr
        # print("CAPTCHA filled in automatically:", verifycode)
        data = {
            'WebUserNO': str(self.user),
            'Password': str(self.password),
            'Agnomen': verifycode,
        }
        url = "http://202.118.31.197/ACTIONLOGON.APPPROCESS?mode=4"
        t = self.s.post(url, data=data).text
        if re.findall("images/Logout2", t) == []:
            # Login failed: pull the message out of the page's alert(...) call.
            # The pattern is reconstructed from the damaged listing.
            l = '[0,"' + re.findall('(alert)(.+?);', t)[1][1][2:-2] + '"]' + " " + self.user + " " + self.password + "\n"
            # print(l)
            return [False, l]
        else:
            # Pattern kept as it survives in the damaged listing.
            l = 'Login successful ' + re.findall('!(.+?) ', t)[0] + " " + self.user + " " + self.password + "\n"
            # print(l)
            return [True, l]

    def getInfo(self):
        imageUrl = 'http://202.118.31.197/ACTIONDSPUSERPHOTO.APPPROCESS'
        # Student record information
        data = self.s.get('http://202.118.31.197/ACTIONQUERYBASESTUDENTINFO.APPPROCESS?mode=3').text
        data = BeautifulSoup(data, "lxml")
        q = data.find_all("table", attrs={'align': "left"})
        a = []
        # Collect the text of every cell in the first two tables,
        # skipping the bare strings between tags.
        for i in q[0]:
            if type(i) == type(q[0]):
                for j in i:
                    if type(j) == type(i):
                        a.append(j.text)
        for i in q[1]:
            if type(i) == type(q[1]):
                for j in i:
                    if type(j) == type(i):
                        a.append(j.text)
        # Cells alternate between field name and field value.
        data = {}
        for i in range(1, len(a), 2):
            data[a[i - 1]] = a[i]
        # data['photo'] = io.BytesIO(self.s.get(imageUrl).content)
        return json.dumps(data)

    def getPic(self):
        imageUrl = 'http://202.118.31.197/ACTIONDSPUSERPHOTO.APPPROCESS'
        pic = Image.open(io.BytesIO(self.s.get(imageUrl).content))
        return pic

    def getScore(self):
        # Transcript
        score = self.s.get('http://202.118.31.197/ACTIONQUERYSTUDENTSCORE.APPPROCESS').text
        score = BeautifulSoup(score, "lxml")
        # The height value "$" is garbled in the source listing.
        q = score.find_all(attrs={'height': "$"})[0]
        point = q.text
        # The label on the real page is presumably Chinese; it appears translated here.
        print(point[point.find('average GPA'):])
        table = score.html.body.table
        people = table.find_all(attrs={'height': '36'})[0].string
        r = table.find_all('table', attrs={'align': 'left'})[0].find_all('tr')
        subject = []
        lesson = []
        # The first row holds the column headers.
        for i in r[0]:
            if type(r[0]) == type(i):
                subject.append(i.string)
        # Map each row's cells onto the headers.
        for i in r:
            k = 0
            temp = {}
            for j in i:
                if type(r[0]) == type(j):
                    temp[subject[k]] = j.string
                    k += 1
            lesson.append(temp)
        lesson.pop()    # drop the last row
        lesson.pop(0)   # drop the header row
        return json.dumps(lesson)

    def logoff(self):
        return self.s.get('http://202.118.31.197/ACTIONLOGOUT.APPPROCESS').text


if __name__ == "__main__":
    a = Student(20150000, 20150000)
    r = a.login()
    print(r[1])
    if r[0]:
        r = json.loads(a.getScore())
        for i in r:
            for j in i:
                print(i[j], end=' ')
            print()
        q = json.loads(a.getInfo())
        for i in q:
            print(i, q[i])
        a.getPic().show()
        a.logoff()