Parsing the simplest verification code with Python

Source: Internet
Author: User
I have been learning Python recently, and my school happened to require students to choose a dorm online, so I wrote a dorm-grabbing program in Python. One of its modules handles login, and logging in requires entering a verification code (captcha). I later found a bug that lets you log in while bypassing the verification code entirely, but that is another topic. I had not discovered that secret at the beginning, so I wrote this Python snippet to parse the captcha.

Our school's verification code is about as simple as a verification code gets. It looks roughly like this:

The image is 60x24 pixels, and each digit occupies 15x24 pixels.

Looking at this verification code, you can see that it contains only digits and that the font is very regular, but each digit is drawn in a different color.

I had two ideas.

1. Slice the whole image evenly into four parts, one per digit, then scan every pixel of each part. Initialize a signature buffer for each digit with one bit per pixel: 15x24 bits, which is 45 bytes in total.

First take the background color; the pixel at (0,0) is known to be background. Then scan each pixel of the digit and compare it with the background color, recording 1 if they are the same and 0 if they differ. Doing this for the 10 characters 0-9 gives a reference signature for each of them. To parse a verification code, split its image in the same way and compare each part's signature with the reference signatures.
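The first idea can be sketched as follows. This is a minimal illustration rather than the author's code. It assumes the digit has already been cropped into a 15x24 grid of pixel values, held in a hypothetical nested list `pixels[y][x]` (with PIL it could be filled from `getpixel((x, y))`). The names `make_signature` and `match_digit` are made up for this example, and picking the reference with the fewest differing bits is one possible way to do the comparison.

```python
WIDTH, HEIGHT = 15, 24  # size of one digit, as stated in the article


def make_signature(pixels):
    """Pack a 15x24 digit into 45 bytes: bit 1 = background, bit 0 = drawn."""
    background = pixels[0][0]  # (0, 0) is assumed to be background
    bits = 0
    for y in range(HEIGHT):
        for x in range(WIDTH):
            bits = (bits << 1) | (1 if pixels[y][x] == background else 0)
    return bits.to_bytes(WIDTH * HEIGHT // 8, "big")


def match_digit(signature, known_signatures):
    """Return the digit whose reference signature differs in the fewest bits."""
    def distance(a, b):
        return sum(bin(p ^ q).count("1") for p, q in zip(a, b))

    return min(known_signatures,
               key=lambda digit: distance(signature, known_signatures[digit]))
```

Here `known_signatures` is a dict mapping each digit to the signature built from a sample slice of that digit.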

2. The glyph of each of the 10 characters 0-9 is different, so some pixel positions may be drawn by only one digit. Suppose, for example, that the pixels at (2,12) and (1,13) are drawn only by the digit 9. Then if the pixel at (2,12) of a slice matches the background color, the slice cannot be 9; otherwise it must be 9.

Both methods share one problem: the digits are not always drawn at the same horizontal position. Where the digits in the other positions start from the 3rd column, for example, the first digit may start from the 4th. I did not analyze this in detail. There is a way around it, though, and the one I use is to measure from the first column that contains a non-background pixel. However the digit is shifted, the x distance of each of its pixels from its leftmost point stays constant.
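The offset handling can be illustrated with a short sketch. It again works on a hypothetical nested list `pixels[y][x]` instead of a PIL image, and the function names are invented for this example. The point is only that coordinates measured from the leftmost drawn column do not change when the glyph is shifted.

```python
def first_ink_column(pixels, background):
    """Return the index of the leftmost column with a non-background pixel."""
    for x in range(len(pixels[0])):
        for row in pixels:
            if row[x] != background:
                return x
    return None  # the slice is blank


def relative_ink_pixels(pixels, background):
    """Return drawn coordinates as (x - offset, y), independent of the shift."""
    offset = first_ink_column(pixels, background)
    if offset is None:
        return set()
    return {(x - offset, y)
            for y, row in enumerate(pixels)
            for x, value in enumerate(row)
            if value != background}
```

A glyph that starts in the 3rd column and the same glyph starting in the 4th column produce the same set of relative coordinates.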

In the end I implemented the second method, because it is the fastest: it only needs to read the feature pixels.

My approach is this: first select three sample images that together contain all 10 characters 0-9, then check each pixel of each digit against the background color. If the pixel differs from the background color, put that digit into the hash table entry for that pixel.

Finally, analyze the hash table to find which pixels are drawn by only 1 digit, which are shared by 2 digits, which by 3 digits, and so on.

From that, find pixels that uniquely determine a digit. For example, the two pixels (0,18) and (0,19) uniquely determine the digit 1.
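The hash table analysis can be sketched like this. It is a simplified illustration, not the author's code: the table here stores digit labels, and the input `glyphs` is a hypothetical dict that maps each digit to the set of its drawn coordinates, already measured from the leftmost drawn column.

```python
from collections import defaultdict


def pixels_to_digits(glyphs):
    """Map each drawn coordinate to the set of digits that draw on it."""
    table = defaultdict(set)
    for digit, coords in glyphs.items():
        for coord in coords:
            table[coord].add(digit)
    return table


def unique_pixels(glyphs):
    """Return digit -> sorted list of coordinates that only that digit uses."""
    result = {digit: [] for digit in glyphs}
    for coord, digits in pixels_to_digits(glyphs).items():
        if len(digits) == 1:
            (digit,) = digits
            result[digit].append(coord)
    return {digit: sorted(coords) for digit, coords in result.items()}
```

A digit whose list comes back empty shares all of its pixels with other digits, so it cannot be identified by a single pixel of its own.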

Then a hash dictionary is drawn:

With this dictionary, the value of a verification code can be determined just by checking those pixels.
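As a rough sketch of how such a dictionary might be used: the function below checks the feature pixels of each digit in turn. The name `classify`, the set-based input, and the feature table in the usage lines are assumptions for illustration. Only the pair (0,18), (0,19) for the digit 1 comes from the text above; the other entries are invented.

```python
def classify(ink, key_pixels, default=None):
    """Return the first digit whose feature pixels are all drawn, else default.

    ink        -- set of drawn (x, y) coordinates, measured from the leftmost
                  drawn column of the slice
    key_pixels -- dict mapping digit -> list of feature coordinates
    """
    for digit in sorted(key_pixels):
        features = key_pixels[digit]
        # A digit with no feature pixels can never be confirmed this way.
        if features and all(coord in ink for coord in features):
            return digit
    return default


# Hypothetical feature table for a quick check.
example_keys = {1: [(0, 18), (0, 19)], 3: [], 7: [(6, 1)]}
print(classify({(0, 18), (0, 19), (5, 5)}, example_keys))
```

The `default` argument covers a digit that has no unique pixel of its own: if no other digit matches, the slice is taken to be that digit.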

The specific code is described below.

1. First, the analysis code, which is used to find the characteristic pixels of each digit:

from PIL import Image
import os

# Path of the sample images
path = "c:\\vaildpic\\"
# Get the sample images
images = os.listdir(path)
# Holds the sliced digit images (0-9)
nubImgs = []
# Holds the background color for each slice
backPixels = []
# Pixel lookup table
pixDir = {}
# Offset of the first non-background column of each slice
pixBlankEndPos = []

# Returns the offset of the digit structure within the image
def getLastBlankPosition(materialPic, x=0):
    bc = materialPic.getpixel((0, 0))
    for i in range(15):
        for j in range(24):
            if materialPic.getpixel((i + x, j)) != bc:
                return i

# This is only for analysis, so it is not written very rigorously.
# Load the images in the target folder
for image in images:
    if os.path.isdir(path + image):
        continue
    image = Image.open(path + image)
    # Cut each image into four slices and record, for each slice, the
    # background color and the x offset of the first non-background
    # column; both are used in the calculation below
    for i in range(4):
        ma = image.crop((i * 15, 0, (i + 1) * 15, 24))
        nubImgs.append(ma)
        backPixels.append(image.getpixel((0, 0)))
        pixBlankEndPos.append(getLastBlankPosition(ma))
print(pixBlankEndPos)

# For each pixel of each digit slice, if the pixel is not the background
# color, put the slice into the dictionary entry for that position. The
# statistics below then give the feature pixels of each digit. Structure:
#   pixDir[(x - x_offset, y)][imgSeq] = picture
for i in range(15):
    for j in range(24):
        pixDir[(i, j)] = {}
        for imgNum in range(len(nubImgs)):
            if nubImgs[imgNum].getpixel((i, j)) != backPixels[imgNum]:
                pixDir[(i - pixBlankEndPos[imgNum], j)][imgNum] = nubImgs[imgNum]

# Save the slices into folders named after the number of slices sharing the pixel
for pix in pixDir.items():
    if len(pix[1]) <= 6:
        print(pix)
        i = 0
        for pic in pix[1].items():
            i += 1
            if not os.path.exists(path + str(len(pix[1]))):
                os.mkdir(path + str(len(pix[1])))
            pic[1].save(os.path.join(path + str(len(pix[1])),
                        str(pix[0][0]) + "_" + str(pix[0][1]) + "__" + str(i) + ".bmp"))

Material Picture:


The parsing results are as follows


The folder named N holds the slices for pixels that are shared by N slices. I did the next step of the analysis by hand. It could also be done by a program, but the program would have to be told beforehand which slice is which digit; the image file name could be set to the corresponding verification code for this purpose. I only thought of this later, so it has not been implemented.
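A minimal sketch of that unimplemented idea might look like this. It assumes a hypothetical naming convention in which each sample image is named after the four digits it shows, for example `0374.bmp`. Both function names are invented for this example.

```python
import os


def labels_from_filename(filename, digits_per_image=4):
    """Read the digits of a sample image from its name: '0374.bmp' -> [0, 3, 7, 4]."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    if len(stem) != digits_per_image or not stem.isdigit():
        raise ValueError("file name does not spell out the code: %r" % filename)
    return [int(ch) for ch in stem]


def slice_boxes(digits_per_image=4, width=15, height=24):
    """Crop boxes (left, upper, right, lower), one per digit slice."""
    return [(i * width, 0, (i + 1) * width, height)
            for i in range(digits_per_image)]
```

Pairing the i-th label with the i-th box tells the program which digit each slice contains, so the shared-pixel statistics could be grouped by digit without manual inspection.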

2. The next step is to use the resulting feature pixels to parse the verification code.

The following function obtains the background colors. The idea is the same as in the analysis above, but it collects every color along the top row of the image, because nothing is drawn on the top row.

def getBackColors(bmp):
    colors = []
    # Scan the top row; the image is 60 pixels wide
    for i in range(60):
        if bmp.getpixel((i, 0)) not in colors:
            colors.append(bmp.getpixel((i, 0)))
    return colors

As in the analysis above, get the offset of the first drawn column:

def getLastBlankPosition(materialPic, x=0):
    bc = getBackColors(materialPic)
    for i in range(15):
        for j in range(24):
            if materialPic.getpixel((i + x, j)) not in bc:
                return i

Parse the verification code, using the feature pixels to decide each digit:

# numberKeyPixel is the feature-pixel dictionary obtained from the analysis above
def getVaildJpgNumber(bmp):
    print('getVaildJpgNumber')
    vaildStr = ""
    backColors = getBackColors(bmp)
    # Check each of the 4 digits of the code; digit n spans x = n*15 to (n+1)*15
    for pos in range(4):
        # Offset of the first drawn column at this position
        offset = getLastBlankPosition(bmp, pos * 15)
        # For each of 0-9, test whether its feature pixels are background.
        # If none of them is, the digit is found; otherwise try the next
        # digit. The pixels of 3 are basically all shared with other
        # digits, so if no specific digit is found, the result is 3.
        for nr in range(0, 10):
            isThisNr = True
            for pix in numberKeyPixel[nr]:
                if pix[0] + offset >= 15:
                    isThisNr = False
                    break
                # getpixel takes a single (x, y) tuple
                if bmp.getpixel((pix[0] + offset + pos * 15, pix[1])) in backColors:
                    isThisNr = False
                    break
            if isThisNr and len(numberKeyPixel[nr]) != 0:
                vaildStr += str(nr)
                break
        if len(vaildStr) == pos:
            vaildStr += '3'
    print(vaildStr)
    return vaildStr

The verification code is fetched from the network using httplib; I have replaced our school's name with MySchool.

That concludes this introduction to parsing the simplest verification code with Python. I hope you find it useful.
