Image Recognition exercise (character verification code, license plate number, ID card number)

Source: Internet
Author: User

Byoy 2012

You are welcome to discuss related issues with me. Contact: 1429013154

Code here (note that this version is not the final version)

Optical character recognition (OCR) is a very useful technology, with applications in verification code (CAPTCHA) recognition, license plate recognition, and general text recognition. Recognizing isolated characters is relatively easy compared with recognizing free-form text.

When I saw a friend studying verification code recognition, I got the itch to try it myself. Of course, I only tackled simple verification codes.

The approach is not limited to verification codes. It can also handle license plate numbers, ID card numbers, house numbers, and other assorted content.

The identification process is clear:

1. Preprocess the image.

2. Project the image along the Y axis (count the foreground pixels in each column) to build a histogram.

3. Analyze the histogram to find partitions.

4. Split the image into individual characters based on the partitions (this is the key step: the better the split, the higher the recognition rate later).

5. Discard blank or invalid characters.

6. Automatically rotate characters (if skewed), then recognize them.
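Steps 2 to 5 can be sketched roughly as follows. This is a minimal illustration, not the code from the post: it assumes the image has already been binarized into a list of rows of 0/1 values (1 = foreground), and the function names and the `threshold` and `min_width` parameters are hypothetical.

```python
def column_projection(binary):
    """Count foreground pixels (value 1) in each column of a 0/1 image."""
    if not binary:
        return []
    width = len(binary[0])
    return [sum(row[x] for row in binary) for x in range(width)]


def partition_columns(projection, threshold=0, min_width=1):
    """Return (start, end) column ranges whose projection exceeds threshold.

    `end` is exclusive. Runs narrower than `min_width` are dropped as
    blank or invalid characters (assumed interpretation of step 5).
    """
    segments = []
    start = None
    for x, count in enumerate(projection):
        if count > threshold and start is None:
            start = x  # a run of character columns begins
        elif count <= threshold and start is not None:
            if x - start >= min_width:
                segments.append((start, x))
            start = None
    # Close a run that reaches the right edge of the image.
    if start is not None and len(projection) - start >= min_width:
        segments.append((start, len(projection)))
    return segments


def split_characters(binary, threshold=0, min_width=1):
    """Cut the image into one sub-image per detected column range."""
    projection = column_projection(binary)
    ranges = partition_columns(projection, threshold, min_width)
    return [[row[a:b] for row in binary] for a, b in ranges]


image = [
    [0, 1, 1, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 1, 0, 0],
]
print(column_projection(image))                     # [0, 3, 2, 0, 0, 3, 1, 0]
print(partition_columns(column_projection(image)))  # [(1, 3), (5, 7)]
```

Raising `threshold` above 0 lets thin interference lines be ignored, at the risk of cutting through thin character strokes.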

If characters in the sample touch each other (adhesion), the partition may be inaccurate, and automatic rotation then becomes difficult.

At this point the characters can be separated. Next, I will study how to recognize them. (If the individual characters are fairly standard, a ready-made OCR control can be used.)

Here are some examples.

Common Verification Code (no difficulty)

Verification code with interference

High-intensity interference (the current partition algorithm fails here; a better approach, such as dynamic thresholds, is needed)

CSDN verification code (no pressure)

ID card number

License plate number

Here is a QQ verification code as well. It is hard to split using a single threshold; the cut positions must also take the character width into account.

This is the result of partitioning with a single threshold (no limit on the width). The result is poor.
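One way to add a width limit is sketched below. It is an assumption about how such a limit could work, not the post's implementation: it takes (start, end) column ranges plus the per-column foreground counts, and cuts any over-wide range at the column with the fewest foreground pixels inside the window where a character boundary is plausible. The name `split_wide_segments` and the `min_width`/`max_width` parameters are hypothetical.

```python
def split_wide_segments(segments, projection, min_width, max_width):
    """Split any (start, end) range wider than max_width.

    Each cut is placed at the column with the smallest projection value
    between start + min_width and start + max_width, so every piece is
    at least min_width wide (except possibly the last) and at most
    max_width wide. `end` is exclusive.
    """
    if not 1 <= min_width <= max_width:
        raise ValueError("need 1 <= min_width <= max_width")
    result = []
    for start, end in segments:
        while end - start > max_width:
            lo = start + min_width
            hi = min(start + max_width, end - 1)
            # Weakest column in the window: most likely the gap or the
            # thin joint between two touching characters.
            cut = min(range(lo, hi + 1), key=lambda x: projection[x])
            result.append((start, cut))
            start = cut
        result.append((start, end))
    return result


# Two touching characters: one wide run with a weak column at x = 4.
projection = [0, 3, 4, 3, 1, 3, 4, 3, 0]
print(split_wide_segments([(1, 8)], projection, min_width=2, max_width=4))
# [(1, 4), (4, 8)]
```

The bounds would normally come from the expected character size for a given verification code source; ranges already within `max_width` pass through unchanged.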

Next, I will continue to study how to optimize the partition algorithm and how to recognize a single character (combining multiple recognition passes with training on samples is worth considering).
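As one possible starting point for single-character recognition, a nearest-template matcher can be sketched as below. This is not the post's method, which is still open at this stage; it assumes every character has already been scaled to the same size as a 0/1 grid, and the names `hamming`, `classify`, and `templates` are hypothetical.

```python
def hamming(a, b):
    """Number of differing pixels between two equally sized 0/1 glyphs."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("glyphs must have the same size")
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def classify(glyph, templates):
    """Return the label of the template closest to `glyph`.

    `templates` is a list of (label, glyph) pairs, assumed to come
    from manually labeled sample characters.
    """
    if not templates:
        raise ValueError("no templates")
    label, _ = min(templates, key=lambda item: hamming(glyph, item[1]))
    return label


templates = [
    ("1", [[0, 1, 0], [0, 1, 0], [0, 1, 0]]),
    ("7", [[1, 1, 1], [0, 0, 1], [0, 0, 1]]),
]
noisy_seven = [[1, 1, 1], [0, 0, 1], [0, 1, 1]]  # one flipped pixel
print(classify(noisy_seven, templates))  # 7
```

Keeping several templates per label is a simple form of sample training: each new labeled sample is appended to `templates`.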

 

The verification code of the Pacific website is attached.

There is some adhesion, but it can be handled with a fixed character width (the characters are basically the same width).

Refer to this figure (take the overall width of the text, divide it by the number of characters to get the width of each, and extract each character separately).
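The fixed-width split can be sketched as follows, assuming the number of characters is known in advance and the text extent is taken from the first and last non-empty columns of the per-column foreground counts. The function names are hypothetical.

```python
def text_extent(projection):
    """Return (start, end) of the non-empty columns, end exclusive."""
    cols = [x for x, count in enumerate(projection) if count > 0]
    if not cols:
        return None
    return cols[0], cols[-1] + 1


def fixed_width_ranges(start, end, char_count):
    """Divide the column range [start, end) into char_count near-equal parts."""
    if char_count <= 0:
        raise ValueError("char_count must be positive")
    total = end - start
    # Integer arithmetic keeps the boundaries exact and spreads any
    # leftover pixels across the characters.
    bounds = [start + total * i // char_count for i in range(char_count + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


print(fixed_width_ranges(10, 90, 4))
# [(10, 30), (30, 50), (50, 70), (70, 90)]
```

This only works when the characters really are of similar width; a proportional font would need the width-limited or projection-based split instead.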

For binarization I used the Otsu algorithm. Reference: N. Otsu, "A threshold selection method from gray-level histograms", IEEE Transactions on Systems, Man, and Cybernetics 9 (1), pp. 62-66, 1979.
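For reference, Otsu's method picks the threshold that maximizes the between-class variance of the gray-level histogram. The sketch below is a plain-Python illustration under assumed conventions, not the post's code: pixels are integers in 0..255, pixels at or below the threshold form one class, and the `dark_text` flag in `binarize` is a hypothetical parameter.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    `pixels` is a flat iterable of ints in [0, levels). Pixels <= t
    form class 0, the rest class 1.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in class 0
    sum0 = 0  # sum of intensities in class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue        # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break           # class 1 empty from here on
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2  # unnormalized
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(gray, t, dark_text=True):
    """Map a 2D gray image to 0/1, with 1 marking the text pixels."""
    if dark_text:
        return [[1 if p <= t else 0 for p in row] for row in gray]
    return [[1 if p > t else 0 for p in row] for row in gray]


gray = [[200, 10, 200], [200, 10, 10]]
t = otsu_threshold(p for row in gray for p in row)
print(binarize(gray, t))  # [[0, 1, 0], [0, 1, 1]]
```

A single global threshold like this is what struggles on the high-interference samples; a dynamic (local) threshold would compute `t` per region instead.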

For verification codes, this paper is very good. For details, see: "Text-based CAPTCHA Strengths and Weaknesses", ACM Conference on Computer and Communications Security 2011 (CCS '11).

Byoy 2012

Improved stain-removal (denoising) algorithm

 

Splitting the characters of a stained license plate number
