Character Recognition Exercise (Verification Codes, License Plate Numbers, ID Card Numbers, etc.)

Source: Internet
Author: User

Last Update: 29 Jun, 2012

Byoy 2012

You are welcome to discuss related issues with me. Contact: 1429013154

Note: at present, only simple character splitting is implemented. The rest of the work was put on hold.



Optical character recognition (OCR) is a very useful technology. Among its applications (verification code recognition, license plate recognition, and general text recognition), recognizing isolated characters is relatively easy compared with recognizing running text.

When I saw a friend studying verification code recognition, I got the itch to try it, so I (byoy) wrote my own version. Of course, it only handles simple verification codes.

The approach is not limited to verification codes. It can also be applied to license plate numbers, ID card numbers, house numbers, and other assorted content.

The recognition process is straightforward:

1. Pre-process the image.

2. Project the image along the Y axis, that is, count the foreground pixels in each column.

3. Analyze the resulting histogram to find the partitions.

4. Split the image into individual characters based on the partitions (this is the key step: the better the split, the higher the later recognition rate).

5. Discard blank or invalid characters.

6. Automatically rotate each character (if skewed), then recognize it.
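Steps 1 to 5 can be sketched in a few lines of plain Python. This is only a minimal illustration, not the original source code: it assumes the image is already a grayscale grid (a list of rows of 0-255 values) with dark characters on a light background, and the function names and the default values (threshold 128, minimum width 2, minimum 4 pixels) are hypothetical.

```python
def binarize(gray, threshold=128):
    """Step 1: map grayscale rows (0-255) to 0/1; dark pixels become 1."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def column_projection(binary):
    """Step 2: count the foreground pixels in every column."""
    return [sum(col) for col in zip(*binary)]


def find_segments(projection, min_count=1):
    """Step 3: return (start, end) column ranges where the histogram is non-empty.

    `end` is exclusive, so a segment covers columns start .. end - 1.
    """
    segments, start = [], None
    for x, count in enumerate(projection):
        if count >= min_count and start is None:
            start = x                      # a run of character columns begins
        elif count < min_count and start is not None:
            segments.append((start, x))    # the run ended at a gap
            start = None
    if start is not None:                  # the run touches the right border
        segments.append((start, len(projection)))
    return segments


def split_characters(binary, min_width=2, min_pixels=4):
    """Steps 4-5: cut the image at segment borders and drop tiny fragments."""
    chars = []
    for start, end in find_segments(column_projection(binary)):
        glyph = [row[start:end] for row in binary]
        pixels = sum(sum(row) for row in glyph)
        if end - start >= min_width and pixels >= min_pixels:
            chars.append(glyph)
    return chars
```

Calling split_characters(binarize(gray)) returns one small 0/1 grid per character. The fragment filter in step 5 is deliberately crude; real noise removal would likely need more than a width and pixel-count check.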

If characters in the sample image touch each other, the partition may be inaccurate. In that case, automatic rotation also becomes difficult.

Currently, characters can be separated. The next step is to study how to recognize them. (If the individual characters are fairly standard, a ready-made OCR control can be used.)
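The article does not describe how the automatic rotation works. One common heuristic, shown here purely as an assumption, is to try a range of angles and keep the one that makes the character's horizontal extent narrowest. The sketch works on a single 0/1 glyph grid; estimate_skew and its parameters are hypothetical names.

```python
import math


def foreground_points(glyph):
    """Collect the (x, y) coordinates of all foreground pixels."""
    return [(x, y) for y, row in enumerate(glyph) for x, v in enumerate(row) if v]


def projected_width(points, angle_deg):
    """Horizontal extent of the points after rotating them by angle_deg."""
    t = math.radians(angle_deg)
    xs = [x * math.cos(t) - y * math.sin(t) for x, y in points]
    return max(xs) - min(xs)


def estimate_skew(glyph, max_angle=30, step=1):
    """Return the trial angle (in degrees) that gives the narrowest glyph."""
    points = foreground_points(glyph)
    if not points:
        return 0                           # nothing to rotate
    angles = range(-max_angle, max_angle + 1, step)
    return min(angles, key=lambda a: projected_width(points, a))
```

The returned angle follows the sign convention of projected_width, so the same convention has to be used when the glyph is actually rotated. As noted above, this only works on cleanly separated characters: if two characters stick together, the narrowest-width criterion has no meaningful answer.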

Here are some examples.

Common Verification Code (no difficulty)

Verification code with interference

High-intensity interference (the current partition algorithm cannot handle this; a better algorithm, such as dynamic thresholding, is needed)

CSDN verification code (no trouble)
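The article does not say which dynamic-threshold method was intended. One plausible reading is local (adaptive) thresholding, where each pixel is compared with the mean of its neighborhood rather than with one global value. The following is a rough sketch under that assumption; the function name and the radius and offset parameters are hypothetical.

```python
def adaptive_binarize(gray, radius=2, offset=10):
    """Mark a pixel as foreground (1) when it is darker than its local mean minus offset.

    `gray` is a list of rows of 0-255 values; the window is clipped at the borders.
    """
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            y0, y1 = max(0, y - radius), min(height, y + radius + 1)
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            window = [gray[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            local_mean = sum(window) / len(window)
            out[y][x] = 1 if gray[y][x] < local_mean - offset else 0
    return out
```

Because the comparison is local, a stroke on a bright part of the background and a stroke on a dark part can both be picked up, which a single global threshold cannot do. Whether this is enough for heavy interference lines is untested here.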

ID card number

License plate number

A QQ verification code, added for comparison. It is difficult to split using a single threshold; the split must also take the character width into account.

This is the result of partitioning with a single threshold (no limit on the width). The result is poor.

Next, I will continue to study how to optimize the partition algorithm and how to recognize a single character (multiple recognition passes plus sample training could be considered).
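How the character width should be used is not spelled out in the article. One possible approach, given here only as an assumption, is to take every segment that is wider than the expected character width and cut it again at its lowest interior column of the projection histogram. The function name and the max_width and min_width parameters are hypothetical.

```python
def split_wide_segment(projection, start, end, max_width, min_width=2):
    """Cut [start, end) at its lowest interior column until each piece fits max_width.

    `projection` is the per-column foreground count; `end` is exclusive.
    The cut column becomes the first column of the right-hand piece.
    """
    if end - start <= max_width:
        return [(start, end)]
    # Only consider cuts that leave at least min_width columns on both sides.
    candidates = range(start + max(1, min_width), end - max(1, min_width) + 1)
    if not candidates:
        return [(start, end)]              # too narrow to cut sensibly
    cut = min(candidates, key=lambda x: projection[x])
    return (split_wide_segment(projection, start, cut, max_width, min_width)
            + split_wide_segment(projection, cut, end, max_width, min_width))
```

This would be applied to each segment produced by the single-threshold partition. It still assumes that touching characters leave a dip in the histogram; heavily overlapping characters would need something stronger.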
