Python verification code (CAPTCHA) recognition and processing example
I. Preparation and code example
(1) Install PIL: the download is an .exe installer; double-click it to install. It is installed automatically into C:\Python27\Lib\site-packages.
(2) pytesser: after downloading and decompressing the package, put the extracted folder into C:\Python27\Lib\site-packages (adjust this to the Python path you have installed) and create a new file there named pytesser.pth whose content is the single word pytesser. The content must match the name of the pytesser folder exactly: the folder name, the pytesser.pth file name, and the file content must all agree.
(3) Download the Tesseract OCR engine: extract the downloaded file, find its tessdata folder, and use it to replace the tessdata folder inside the pytesser folder.
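As a small illustration of step (2), the sketch below writes a .pth file next to a package folder and lets Python's standard site module pick it up. It uses a temporary directory instead of the real site-packages, and the helper name write_pth is hypothetical, introduced only for this example.

```python
import os
import site
import sys
import tempfile


def write_pth(site_dir, folder_name):
    """Create <folder_name>.pth in site_dir whose content is the folder name."""
    pth_path = os.path.join(site_dir, folder_name + ".pth")
    with open(pth_path, "w") as handle:
        handle.write(folder_name + "\n")
    return pth_path


# Throwaway directory standing in for C:\Python27\Lib\site-packages (assumption).
demo_site = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_site, "pytesser"))  # stands in for the unpacked pytesser folder
write_pth(demo_site, "pytesser")

# addsitedir() reads every *.pth file in the directory and appends the
# existing directories they name to sys.path.
site.addsitedir(demo_site)
print(os.path.join(demo_site, "pytesser") in sys.path)
```

If the name written inside the .pth file does not match an existing folder, the entry is silently skipped, which is why the folder name and the file content have to agree.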
II. Verification
(1) Principle:
Verification Code Image Processing
Verification code image recognition works mainly on the pixels of the image. By performing a series of operations on those pixels, it outputs the text matrix of each character in the verification code image.
- 1. Read Images
- 2. Image Noise Reduction
- 3. Image Cutting
- 4. Image text output
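The noise reduction, cutting, and text output steps above can be sketched in pure Python on a small binary matrix. This is only an illustration under stated assumptions: the image is already read into nested lists, 1 marks a character pixel and 0 marks background, and the function names remove_isolated_pixels and split_columns are hypothetical. It is not the method used by pytesser.

```python
def remove_isolated_pixels(matrix, min_neighbors=1):
    """Clear character pixels (1) with fewer than min_neighbors set pixels among their 8 neighbours."""
    height, width = len(matrix), len(matrix[0])
    cleaned = [row[:] for row in matrix]
    for y in range(height):
        for x in range(width):
            if matrix[y][x] != 1:
                continue
            neighbors = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        neighbors += matrix[ny][nx]
            if neighbors < min_neighbors:
                cleaned[y][x] = 0
    return cleaned


def split_columns(matrix):
    """Cut the matrix into character blocks at columns that contain no character pixel."""
    width = len(matrix[0])
    blocks, start = [], None
    for x in range(width):
        has_ink = any(row[x] for row in matrix)
        if has_ink and start is None:
            start = x                      # a new character begins
        elif not has_ink and start is not None:
            blocks.append([row[start:x] for row in matrix])
            start = None                   # the character ended at column x - 1
    if start is not None:
        blocks.append([row[start:] for row in matrix])
    return blocks


image = [
    [0, 1, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0, 1, 0],  # the lone 1 in column 4 is noise
    [0, 1, 1, 0, 0, 0, 1, 0],
]
cleaned = remove_isolated_pixels(image)
blocks = split_columns(cleaned)

# Text output: print each character matrix, one block per character.
for block in blocks:
    for row in block:
        print("".join("#" if pixel else "." for pixel in row))
    print("")
```

Column projection only separates characters that do not touch or overlap; verification codes with joined characters need a different cutting strategy.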
(2) Verification character recognition
Character recognition in the verification code is mainly implemented with machine learning classification algorithms. Currently we use KNN (the k-nearest neighbors algorithm) and SVM (support vector machines). I will describe the applicable scenarios of these two algorithms in detail later.
- 1. Obtain the character matrix
- 2. Feed the matrix into the classification algorithm
- 3. Output the results
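To make the three steps above concrete, here is a minimal KNN sketch in pure Python. The 3x3 training matrices, the labels, and the choice of Hamming distance are all assumptions made for illustration; a real classifier would be trained on many labelled character matrices cut from actual verification codes.

```python
from collections import Counter


def flatten(matrix):
    """Turn a character matrix into a flat feature vector."""
    return [pixel for row in matrix for pixel in row]


def hamming(a, b):
    """Number of positions where two equal-length vectors differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def knn_classify(sample, training, k=3):
    """Return the majority label among the k training matrices closest to sample."""
    vector = flatten(sample)
    ranked = sorted(training, key=lambda item: hamming(vector, flatten(item[1])))
    votes = Counter(label for label, _ in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training set: (label, character matrix) pairs.
training = [
    ("1", [[0, 1, 0], [0, 1, 0], [0, 1, 0]]),
    ("1", [[1, 1, 0], [0, 1, 0], [0, 1, 0]]),
    ("7", [[1, 1, 1], [0, 0, 1], [0, 0, 1]]),
    ("7", [[1, 1, 1], [0, 0, 1], [0, 1, 0]]),
]

noisy_one = [[0, 1, 0], [0, 1, 0], [1, 1, 0]]
noisy_seven = [[1, 1, 1], [0, 0, 1], [0, 0, 0]]
print(knn_classify(noisy_one, training))
print(knn_classify(noisy_seven, training))
```

Using an odd k with two classes avoids tied votes; with more classes a tie-breaking rule would be needed.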
[Figure: the verification code image to be recognized]
(3) Simple commands:
from pytesser import *
image = Image.open('1.jpg')  # Open image object using PIL
print image_to_string(image)  # Run tesseract.exe on image
Then run the script; it prints the recognized text.
Or recognize an image file directly:
print image_file_to_string('fnord.tif')
This prints the recognized text in the same way.
(4) A more complicated example
The commands above only handle relatively simple images.
Principle: convert the color image to grayscale, binarize the grayscale image, then run recognition on the binary image.
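The first two stages of this principle can be illustrated without any imaging library. The sketch below assumes the image is a nested list of (R, G, B) tuples and uses an integer form of the ITU-R 601-2 luma weights, which is the weighting PIL documents for converting to mode 'L'; the exact rounding inside PIL may differ. The function names to_gray and binarize are hypothetical.

```python
def to_gray(rgb_rows):
    """Grayscale via the ITU-R 601-2 luma weights: L = (299*R + 587*G + 114*B) / 1000."""
    return [[(r * 299 + g * 587 + b * 114) // 1000 for (r, g, b) in row]
            for row in rgb_rows]


def binarize(gray_rows, threshold=140):
    """Map gray values below the threshold to 0 and all others to 1 via a lookup table."""
    table = [0 if value < threshold else 1 for value in range(256)]
    return [[table[value] for value in row] for row in gray_rows]


pixels = [
    [(255, 255, 255), (0, 0, 0)],        # white background, black stroke
    [(200, 30, 30), (180, 180, 180)],    # dark red stroke, light gray background
]
gray = to_gray(pixels)
binary = binarize(gray)
print(gray)
print(binary)
```

The threshold of 140 is simply the value used in this article; a different verification code style may need a different split point.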
# Verification code recognition. This program can only recognize numeric verification codes.
import Image
import ImageEnhance
import ImageFilter
import sys
from pytesser import *

# Binarization lookup table: pixel values below the threshold map to 0, all others to 1
threshold = 140
table = []
for i in range(256):
    if i < threshold:
        table.append(0)
    else:
        table.append(1)

# The codes contain only digits, so use this table to correct
# letters that the recognizer reports in place of digits.
# Keys are upper case because the text is uppercased before replacement.
rep = {'O': '0', 'I': '1', 'L': '1', 'Z': '2', 'S': '8'}

def getverify1(name):
    # Open the image
    im = Image.open(name)
    # Convert to grayscale
    imgry = im.convert('L')
    # Save the grayscale image
    imgry.save('G' + name)
    # Binarize with threshold segmentation; threshold is the split point
    out = imgry.point(table, '1')
    out.save('B' + name)
    # Recognize
    text = image_to_string(out)
    # Clean up the recognized text
    text = text.strip()
    text = text.upper()
    for r in rep:
        text = text.replace(r, rep[r])
    # out.save(text + '.jpg')  # optionally save the binary image under the recognized text
    print text
    return text

getverify1('1.jpg')  # Note: the image must be in the same directory as this script, or pass an absolute path.
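The correction step at the end of the program can be tried on its own, without running any recognition. The sketch below is an assumption-labelled illustration: the function name fix_digits and the sample string are hypothetical, and the mapping is the one from this article with its keys in upper case, since the text is uppercased before replacement.

```python
# Letters that a recognizer may report where a digit was expected.
DIGIT_FIXES = {'O': '0', 'I': '1', 'L': '1', 'Z': '2', 'S': '8'}


def fix_digits(raw_text, fixes=DIGIT_FIXES):
    """Strip whitespace, uppercase, then replace look-alike letters with digits."""
    text = raw_text.strip().upper()
    for wrong, right in fixes.items():
        text = text.replace(wrong, right)
    return text


print(fix_digits(" 4oz7l\n"))
```

Because the text is uppercased first, a lower-case key in the mapping would never match, so every key has to be upper case for the replacement to take effect.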
[Figure: output after running the script]