Python verification code recognition and processing example

Source: Internet
Author: User


I. Preparation and code example
(1) Install PIL: the download is an .exe installer; double-click it to run it. It installs automatically into C:\Python27\Lib\site-packages.
(2) Install pytesser: download and unpack the package, copy the pytesser folder into C:\Python27\Lib\site-packages (adjust this to your own Python installation path), and create a new file there named pytesser.pth whose content is the single line pytesser. Note that the content of the .pth file must exactly match the name of the pytesser folder.
(3) Download the Tesseract OCR engine: unpack the download, find its tessdata folder, and use it to replace the tessdata folder inside the pytesser folder.
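
To see why the .pth file in step (2) matters, here is a minimal sketch of how Python's standard site module handles such a file. It uses a temporary directory as a stand-in for site-packages and a hypothetical folder name, pytesser_demo, so it does not touch a real installation; register_package_dir is an illustrative helper, not part of pytesser.

```python
import os
import site
import sys
import tempfile


def register_package_dir(site_dir, folder_name):
    """Write <folder_name>.pth into site_dir and let `site` process it.

    Returns the directory that the line inside the .pth file points at.
    """
    pth_path = os.path.join(site_dir, folder_name + ".pth")
    with open(pth_path, "w") as handle:
        # One directory per line, interpreted relative to site_dir.
        handle.write(folder_name + "\n")
    # addsitedir() adds site_dir to sys.path and processes its .pth files.
    site.addsitedir(site_dir)
    return os.path.join(site_dir, folder_name)


# Hypothetical stand-in for C:\Python27\Lib\site-packages.
demo_site_dir = tempfile.mkdtemp()
# Hypothetical stand-in for the unpacked pytesser folder.
os.mkdir(os.path.join(demo_site_dir, "pytesser_demo"))

added_dir = register_package_dir(demo_site_dir, "pytesser_demo")
print(added_dir in sys.path)
```

The line inside the .pth file is what has to match the folder name: if the named folder does not exist, the site module simply skips that line, and a later "from pytesser import *" cannot find the module.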

II. Verification
(1) Principle:
Verification code image processing

Verification code image recognition works mainly on the pixels of the image: a series of operations on those pixels produces, as output, the text matrix of each character in the verification code image.

  • 1. Read Images
  • 2. Image Noise Reduction
  • 3. Image Cutting
  • 4. Image text output
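
As a rough illustration of the four steps above, the sketch below runs them on a tiny hand-written image, using plain Python lists instead of an imaging library. Everything in it is an assumption made for the example: the '#'/'.' input format, the function names, the rule "clear ink pixels that have no ink neighbor" for noise reduction, and cutting at blank columns. Real verification codes usually need stronger noise reduction and cutting than this.

```python
def read_image(rows):
    """Step 1: turn rows of '#'/'.' text into a 0/1 pixel matrix (1 = ink)."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


def reduce_noise(pixels, min_neighbors=1):
    """Step 2: clear ink pixels with fewer than min_neighbors ink neighbors."""
    height, width = len(pixels), len(pixels[0])
    cleaned = [row[:] for row in pixels]
    for y in range(height):
        for x in range(width):
            if not pixels[y][x]:
                continue
            neighbors = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        neighbors += pixels[ny][nx]
            if neighbors < min_neighbors:
                cleaned[y][x] = 0
    return cleaned


def cut_characters(pixels):
    """Step 3: split at blank columns and return one matrix per character."""
    width = len(pixels[0])
    has_ink = [any(row[x] for row in pixels) for x in range(width)]
    characters, start = [], None
    for x in range(width + 1):
        inked = x < width and has_ink[x]
        if inked and start is None:
            start = x
        elif not inked and start is not None:
            characters.append([row[start:x] for row in pixels])
            start = None
    return characters


def to_text(matrix):
    """Step 4: render a character matrix as lines of 0s and 1s."""
    return "\n".join("".join(str(v) for v in row) for row in matrix)


# Hypothetical sample: a '1', one isolated noise pixel, and a '0'.
SAMPLE = [
    ".#...###.",
    ".#...#.#.",
    ".#.#.#.#.",
    ".#...#.#.",
    ".#...###.",
]

for character in cut_characters(reduce_noise(read_image(SAMPLE))):
    print(to_text(character))
    print("")
```

Without the noise-reduction step, the isolated pixel in the middle row would be cut out as a third "character", which is why noise reduction comes before cutting.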

(2) Verification code character recognition

Recognizing the characters in a verification code is mainly done with machine learning classification algorithms. Currently we use KNN (the k-nearest neighbors algorithm) and SVM (the support vector machine algorithm). I will describe the scenarios each of these two algorithms suits in detail later.

  • 1. Obtain the character matrix
  • 2. Feed the matrix into the classification algorithm
  • 3. Output the result
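
The three steps above can be sketched with a very small KNN classifier in plain Python. This is only an illustration under stated assumptions: the 3x3 glyphs and their labels are made up for the example, Hamming distance is one possible distance measure among several, and the helper names (glyph, flatten, knn_classify) are hypothetical. SVM is not shown here.

```python
from collections import Counter


def glyph(text):
    """Parse 'row/row/row' strings of 0s and 1s into a character matrix."""
    return [[int(ch) for ch in row] for row in text.split("/")]


def flatten(matrix):
    """Step 1: turn a character matrix into a flat feature vector."""
    return [value for row in matrix for value in row]


def hamming_distance(a, b):
    """Count the positions where two equal-length 0/1 vectors differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def knn_classify(sample, training_set, k=3):
    """Steps 2 and 3: vote among the k nearest labelled vectors."""
    ranked = sorted(training_set,
                    key=lambda item: hamming_distance(sample, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labelled samples: three variants each of '1' and '7'.
TRAINING = [
    (flatten(glyph("010/010/010")), "1"),
    (flatten(glyph("110/010/010")), "1"),
    (flatten(glyph("010/010/111")), "1"),
    (flatten(glyph("111/001/001")), "7"),
    (flatten(glyph("111/001/010")), "7"),
    (flatten(glyph("111/011/001")), "7"),
]

unknown = flatten(glyph("010/010/011"))  # a '1' with one stray pixel
print(knn_classify(unknown, TRAINING))
```

In practice the training set would come from character matrices that were cut out of real verification code images and labelled by hand.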

The image to be verified is as follows:

(3) Simple commands:

from pytesser import *
image = Image.open('1.jpg')  # Open image object using PIL
print image_to_string(image)  # Run tesseract.exe on image

Then run:


Or directly:

print image_file_to_string('fnord.tif') 

This prints the same result.
(4) A more complex example
The commands above only handle relatively simple verification codes.
Principle: convert the color image to grayscale, binarize the grayscale image, then run recognition on the binary image.
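
To make the first two stages of this principle concrete, here is a minimal sketch on raw pixel values, without any imaging library. It assumes the luma weights that the PIL documentation gives for converting RGB to mode 'L' (L = R * 299/1000 + G * 587/1000 + B * 114/1000); the integer rounding here is a simplification and may differ from PIL's own result by one gray level. The function names and the sample pixels are hypothetical.

```python
def to_gray(rgb_pixels):
    """Color to gray, using the 299/587/114 luma weights (integer math)."""
    return [[(r * 299 + g * 587 + b * 114) // 1000 for (r, g, b) in row]
            for row in rgb_pixels]


def build_table(threshold):
    """Lookup table: gray levels below threshold map to 0, the rest to 1."""
    return [0 if level < threshold else 1 for level in range(256)]


def to_binary(gray_pixels, table):
    """Gray to binary: look every gray level up in the table."""
    return [[table[level] for level in row] for row in gray_pixels]


# Hypothetical 2x3 image: white, black, dark red / mid gray, light gray, yellow.
SAMPLE_RGB = [
    [(255, 255, 255), (0, 0, 0), (200, 30, 30)],
    [(120, 120, 120), (160, 160, 160), (250, 240, 60)],
]

gray = to_gray(SAMPLE_RGB)
binary = to_binary(gray, build_table(140))
print(gray)
print(binary)
```

The threshold decides which pixels count as dark (0) and which as light (1). A value that is too low loses faint strokes and one that is too high keeps background noise, so it is usually tuned per verification code style.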

# Verification code recognition. This program can only recognize numeric verification codes.
import Image
import ImageEnhance
import ImageFilter
import sys
from pytesser import *

# Binarization lookup table: gray levels below the threshold become 0, the rest 1
threshold = 140
table = []
for i in range(256):
    if i < threshold:
        table.append(0)
    else:
        table.append(1)

# The codes contain only digits, so use this table to correct
# letters that were recognized in place of digits
rep = {'O': '0',
       'I': '1',
       'L': '1',
       'Z': '2',
       'S': '8'}

def getverify1(name):
    # Open the image
    im = Image.open(name)
    # Convert to grayscale
    imgry = im.convert('L')
    # Save the grayscale image
    imgry.save('G' + name)
    # Binarize by threshold segmentation; threshold is the split point
    out = imgry.point(table, '1')
    out.save('B' + name)
    # Recognize
    text = image_to_string(out)
    # Clean up the recognized text
    text = text.strip()
    text = text.upper()
    for r in rep:
        text = text.replace(r, rep[r])
    # out.save(text + '.jpg')
    print text
    return text

getverify1('1.jpg')  # Note: the image must be in the same directory as this script, or pass an absolute path.
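
The correction step in the program above can also be tried on its own, without PIL or Tesseract. The sketch below is a standalone illustration: the mapping is the one from the program, written with uppercase keys because the text is uppercased before replacement, while normalize_digits and is_all_digits are hypothetical helper names. The extra digit check is an assumption about how a caller might reject a bad recognition result.

```python
# Letters that OCR tends to return in place of digits (from the program above).
DIGIT_FIXES = {"O": "0", "I": "1", "L": "1", "Z": "2", "S": "8"}


def normalize_digits(raw_text, fixes=DIGIT_FIXES):
    """Strip whitespace, uppercase, then swap look-alike letters for digits."""
    text = raw_text.strip().upper()
    for wrong, right in fixes.items():
        text = text.replace(wrong, right)
    return text


def is_all_digits(text):
    """True only if the corrected text is non-empty and purely numeric."""
    return text.isdigit()


for raw in [" 4o1l\n", "sZ7i", "4A1"]:
    fixed = normalize_digits(raw)
    print(repr(raw), fixed, is_all_digits(fixed))
```

A result that still contains letters after correction, such as the last sample, is a sign that recognition failed and that the image should be processed again or a new verification code requested.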

Effect after running:


That is all for this article; I hope it helps with your learning.
