Parsing the simplest kind of verification code
I have been learning Python recently. When it was time to select a dormitory, I wrote a Python program to grab one. One module handles logging on, which requires entering a verification code. It later turned out that the verification code can be bypassed entirely, but that is another topic. At first I had not discovered this, so I wrote the Python code below to parse the verification code.
Our school's verification code is about as simple as they come. It has the following format:
The image is 60x24 pixels, and each digit occupies about 15x24 pixels.
Looking at the verification code, we can see that it contains only digits and that the digits use a standard font. The only variation is that each digit has a different color.
At the time I had two ideas:
1. Slice the whole image evenly into four parts, so that each digit is its own image, then scan every pixel of each part and build a signature buffer for each digit. With one bit per pixel, the buffer is 15x24 bits, that is, 45 bytes in total.
Take the background color first; the pixel at position (0, 0) is background. Then scan each pixel of the digit and compare it with the background color: if it differs, the bit is 1, otherwise 0. Doing this for the 10 characters 0-9 gives their standard feature values. To parse a verification code, split the image, compute the feature value of each part, and compare it with the standard feature values.
2. Each of the 10 characters 0-9 has a different shape, so some pixel positions are covered by only one digit. For example, suppose the digit 9 is the only one that covers a certain position. If the pixel at that position matches the background color, the part is not 9; otherwise it must be 9.
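The first idea can be sketched roughly as follows. This is a minimal Python 3 illustration, not the code the article ends up using: a slice is represented as a plain list of rows of color values instead of a PIL image, and the names `make_signature` and `match_digit` are hypothetical.

```python
WIDTH, HEIGHT = 15, 24  # size of one digit slice, as described above


def make_signature(rows):
    """Pack a WIDTH x HEIGHT slice into 45 bytes.

    A bit is 1 where the pixel differs from the background color.
    `rows` is a list of HEIGHT rows, each a list of WIDTH color values
    (a stand-in for a PIL image, to keep the sketch self-contained).
    """
    background = rows[0][0]  # position (0, 0) is assumed to be background
    bits = 0
    for y in range(HEIGHT):
        for x in range(WIDTH):
            bits = (bits << 1) | (1 if rows[y][x] != background else 0)
    return bits.to_bytes(WIDTH * HEIGHT // 8, "big")


def match_digit(signature, known_signatures):
    """Return the digit whose stored signature equals `signature`, or None."""
    for digit, known in known_signatures.items():
        if known == signature:
            return digit
    return None


# Tiny demonstration with two synthetic "digits" (not real font data).
blank = [[0] * WIDTH for _ in range(HEIGHT)]
slice_a = [row[:] for row in blank]
slice_a[5][3] = 7  # one drawn pixel
slice_b = [row[:] for row in blank]
slice_b[5][4] = 9  # a different drawn pixel, in a different color

known = {1: make_signature(slice_a), 2: make_signature(slice_b)}
result = match_digit(make_signature(slice_b), known)
```

Because each bit only records whether a pixel differs from the background, the signature does not depend on the color a digit happens to be drawn in.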
Both methods share a problem: the first drawn column of a digit is offset by a varying amount. For example, a digit that starts at column 3 in one position may start at column 4 in another. I did not analyze the cause in detail, but there is a way around it: measure coordinates from the first column that contains a non-background pixel. However the digit is shifted, the x distance from its leftmost drawn column stays the same.
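The offset correction can be illustrated with a small sketch. It assumes a slice is a list of rows of color values and that the background color is known; `first_drawn_column`, `relative_pixels`, and `make_slice` are hypothetical names used only for this example.

```python
def first_drawn_column(rows, background):
    """Return the index of the first column with a non-background pixel, or None."""
    width = len(rows[0])
    for x in range(width):
        for row in rows:
            if row[x] != background:
                return x
    return None


def relative_pixels(rows, background):
    """Return the drawn pixels as (x - offset, y), where offset is the first drawn column."""
    offset = first_drawn_column(rows, background)
    if offset is None:
        return set()
    return {
        (x - offset, y)
        for y, row in enumerate(rows)
        for x, value in enumerate(row)
        if value != background
    }


def make_slice(start):
    """Build a 15x24 slice with the same 2x2 shape drawn from column `start`."""
    rows = [[0] * 15 for _ in range(24)]
    for y in (10, 11):
        for x in (start, start + 1):
            rows[y][x] = 1
    return rows


shifted_by_3 = relative_pixels(make_slice(3), background=0)
shifted_by_4 = relative_pixels(make_slice(4), background=0)
```

Both calls yield the same set of relative positions, so a feature pixel expressed this way can be tested regardless of where the digit starts inside its slice.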
In the end I implemented the second idea because it is the fastest: you only need to sample the feature pixels.
My approach works like this. First I picked three material images which together contain all 10 characters 0-9. For each pixel position, I checked every digit slice: if the pixel there differs from the background color, the slice is recorded in the hash table under that position.
Then I analyzed the hash table to find which positions are covered by only one digit, which by two digits, and which by three, and worked out the feature pixels from that.
The goal is to find pixels that uniquely determine a digit. For example, the two pixels (0, 18) and (0, 19) together uniquely determine one digit, which the dictionary below lists as the digit 2.
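The table analysis described above can be sketched as follows. Unlike the article's own analysis, which was finished by hand on unlabeled slices, this sketch assumes the digit shown by each slice is already known. The data is synthetic and the function names are hypothetical.

```python
from collections import defaultdict


def build_pixel_table(digit_pixels):
    """Map each pixel position to the set of digits that draw on it.

    `digit_pixels` maps a digit to the set of (x, y) positions it covers,
    already expressed relative to its first drawn column.
    """
    table = defaultdict(set)
    for digit, pixels in digit_pixels.items():
        for position in pixels:
            table[position].add(digit)
    return table


def pixels_shared_by(table, n):
    """Return the positions covered by exactly `n` digits, sorted for stable output."""
    return sorted(pos for pos, digits in table.items() if len(digits) == n)


# Synthetic data for three "digits" (illustrative, not the real font).
digit_pixels = {
    1: {(0, 0), (1, 1)},
    2: {(0, 0), (2, 2)},
    3: {(0, 0), (1, 1), (3, 3)},
}
table = build_pixel_table(digit_pixels)
unique_positions = pixels_shared_by(table, 1)
shared_by_two = pixels_shared_by(table, 2)
```

In this toy data, digits 2 and 3 each own one unique position, while digit 1 has none. A digit like that has to be recognized some other way, for example by combining shared pixels or by treating it as the fallback.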
This analysis yields the following dictionary of feature pixels:
NumberKeyPixel = {
    0: [(7, 10), (0, 12), (0, 10), (0, 11), (0, 8), (1, 14), (1, 15)],
    1: [(4, 8)],
    2: [(0, 18), (0, 19)],
    3: [],
    4: [(5, 7)],
    5: [(0, 4), (0, 10)],
    6: [(2, 6)],
    7: [(2, 16)],
    8: [(0, 12)],
    9: [(2, 13)],
}
When parsing an image, you only need to test these pixels in sequence to determine its verification code value.
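To make the "in sequence" part concrete, here is a minimal sketch of the lookup on its own, separate from any image handling. It assumes a slice has already been reduced to the set of its drawn positions relative to the first drawn column; `classify` is a hypothetical name, and the test inputs are synthetic rather than real glyphs.

```python
# Feature pixels copied from the dictionary above.
NumberKeyPixel = {
    0: [(7, 10), (0, 12), (0, 10), (0, 11), (0, 8), (1, 14), (1, 15)],
    1: [(4, 8)],
    2: [(0, 18), (0, 19)],
    3: [],
    4: [(5, 7)],
    5: [(0, 4), (0, 10)],
    6: [(2, 6)],
    7: [(2, 16)],
    8: [(0, 12)],
    9: [(2, 13)],
}


def classify(drawn):
    """Return the first digit whose feature pixels are all in `drawn`.

    Digits are tried in the order 0..9. A digit with an empty feature
    list is skipped, and 3 is returned when nothing else matches.
    """
    for digit in range(10):
        features = NumberKeyPixel[digit]
        if features and all(p in drawn for p in features):
            return digit
    return 3


# Synthetic inputs that contain only feature pixels.
looks_like_zero = set(NumberKeyPixel[0])
looks_like_eight = {(0, 12)}
looks_like_nothing = set()
```

The order matters: (0, 12) appears in the lists for both 0 and 8, so 0 has to be tested first, and 8 is only reported when the other feature pixels of 0 are missing.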
The specific code is described below.
1. First, the analysis code that finds the feature pixels of each digit:
from PIL import Image
import os

# Path where the material images are stored
path = "C:\\vaildpic\\"
# List the material images
images = os.listdir(path)
# Slices of the digits 0-9
nubimgs = []
# Background color of each slice
backpixels = []
# Pixel position -> {slice index: slice image}
pixDir = {}
# Offset of the first non-background column of each slice
pixBlankEndPos = []

# Return the offset of the first non-background column of a slice
def GetLastBlankPosition(materialPic, x=0):
    bc = materialPic.getpixel((0, 0))
    for i in range(15):
        for j in range(24):
            if materialPic.getpixel((i + x, j)) != bc:
                return i
    # The analysis is not written strictly; a blank slice falls through here

# Load every image in the material folder
for image in images:
    if os.path.isdir(path + image):
        continue
    image = Image.open(path + image)
    # Cut each image into four slices and record, for each slice,
    # the slice itself, its background color, and its first non-background column
    for i in range(4):
        ma = image.crop((i * 15, 0, (i + 1) * 15, 24))
        nubimgs.append(ma)
        backpixels.append(image.getpixel((0, 0)))
        pixBlankEndPos.append(GetLastBlankPosition(ma))
print(pixBlankEndPos)

# For every pixel of every digit slice: if the pixel is not the background color,
# record the slice under that position. The statistics gathered here are used
# to find the feature pixels of each digit. The structure is:
#     pixDir[(x - x_offset, y)][imgSeq] = picture
for i in range(15):
    for j in range(24):
        pixDir[(i, j)] = {}
        for imgNum in range(len(nubimgs)):
            if nubimgs[imgNum].getpixel((i, j)) != backpixels[imgNum]:
                pixDir[(i - pixBlankEndPos[imgNum], j)][imgNum] = nubimgs[imgNum]

# For each pixel shared by n slices (n <= 6), save those slices into the folder named n
for pix in pixDir.items():
    if len(pix[1]) <= 6:
        print(pix)
        i = 0
        for pic in pix[1].items():
            i += 1
            if not os.path.exists(path + str(len(pix[1]))):
                os.mkdir(path + str(len(pix[1])))
            pic[1].save(os.path.join(path + str(len(pix[1])),
                                     str(pix[0][0]) + "_" + str(pix[0][1]) + "_" + str(i) + ".bmp"))
Material image:
The analysis result is as follows:
Each folder n contains the slices for pixels shared by n images. I did the next step of the analysis by hand. It could also be done by a program, but the program would need to be told in advance which digit each part shows; one way is to name each image after its verification code and parse the file name. I only thought of this later, so I did not implement it.
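The file-name idea mentioned above was never implemented in the article, but it might look something like this sketch. The naming scheme (each material image named after the four digits it shows, such as `4071.bmp`) and the function name `labels_from_filename` are assumptions made for illustration.

```python
import os


def labels_from_filename(filename):
    """Return the digit labels encoded in a file name such as '4071.bmp'.

    Assumes (hypothetically) that each material image is named after the
    four digits it shows, left to right.
    """
    stem = os.path.splitext(os.path.basename(filename))[0]
    if len(stem) != 4 or not stem.isdigit():
        raise ValueError("expected a four-digit file name, got %r" % filename)
    return [int(ch) for ch in stem]


labels = labels_from_filename("4071.bmp")
```

With labels like these, slice i of an image could be filed under `labels[i]`, which would let a program do the table analysis without manual inspection.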
2. Use the feature pixels obtained above to parse the verification code.
The following function obtains the background colors. As in the analysis code above, the background is read from the top of the image, but here the colors are sampled along the whole top row, because nothing is drawn there.
def getBackColors(bmp):
    colors = []
    for i in range(60):
        if bmp.getpixel((i, 0)) not in colors:
            colors.append(bmp.getpixel((i, 0)))
    return colors
Obtain the offset of the first drawn column, just as in the analysis code above.
def GetLastBlankPosition(materialPic, x=0):
    bc = getBackColors(materialPic)
    for i in range(15):
        for j in range(24):
            if materialPic.getpixel((i + x, j)) not in bc:
                return i
Parse the verification code, using the feature pixels to decide each digit:
def GetVaildJpgNumber(bmp):
    print('GetVaildJpgNumber')
    vaildStr = ""
    backColors = getBackColors(bmp)
    # Check the four digits of the verification code one by one.
    # Digit n occupies the x range n * 15 to (n + 1) * 15.
    for pos in range(4):
        # Offset of the first drawn column at this position
        offset = GetLastBlankPosition(bmp, pos * 15)
        # For each digit 0-9, test whether any of its feature pixels is the
        # background color. If so, or if a pixel falls outside the slice, move on
        # to the next digit. The pixels of 3 are almost all shared with other
        # digits, so if no digit matches in the end, the result is 3.
        for nr in range(10):
            isthisNr = True
            for pix in NumberKeyPixel[nr]:
                if pix[0] + offset >= 15:
                    isthisNr = False
                    break
                if bmp.getpixel((pix[0] + offset + pos * 15, pix[1])) in backColors:
                    isthisNr = False
                    break
            if isthisNr and len(NumberKeyPixel[nr]) != 0:
                vaildStr += str(nr)
                break
        if len(vaildStr) == pos:
            vaildStr += '3'
    print(vaildStr)
    return vaildStr
Fetch the verification code image from the network using httplib. My school's name has been replaced with myschool.
import httplib
from io import BytesIO
from PIL import Image

def GetVaildJpg():
    print('GetVaildJpg')
    # sessionId is defined elsewhere in the program (not shown here)
    headers = {
        'Accept': 'image/png, image/svg+xml, image/*;q=0.8, */*;q=0.5',
        'Referer': 'http://zcc.myschool.edu.cn/',
        'Accept-Language': 'zh-Hans-CN,zh-Hans;q=0.8,en-US;q=0.5,en;q=0.3',
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0) like Gecko',
        'Accept-Encoding': 'gzip, deflate',
        'Host': 'zcc.myschool.edu.cn',
        'DNT': '1',
        'Connection': 'Keep-Alive',
        'Cookie': sessionId
    }
    httpClient = httplib.HTTPConnection('zcc.myschool.edu.cn', 80, timeout=300)
    httpClient.request("GET", 'http://zcc.myschool.edu.cn/image.jsp', None, headers)
    response = httpClient.getresponse()
    # print(response.getheaders())
    stBmp = response.read()
    bmp = Image.open(BytesIO(stBmp))
    bmp.save(r'D:\PROJECT\PYTHON\catchDorm\catch.bmp')
    # bmp.show()
    return GetVaildJpgNumber(bmp)
That's it. Everything works: the results were correct in dozens of tests.