Last Update: 29 Jun, 2012
Byoy 2012
You are welcome to discuss related issues with me. Contact: 1429013154
Note: so far, only simple character splitting is implemented. The other work has been put on hold.
Optical character recognition (OCR) is a very useful technology. Among its applications, such as verification code recognition, license plate recognition, and general text recognition, recognizing isolated characters is relatively easy (compared with general text recognition).
When I saw a friend studying verification code recognition, I got the itch to try it myself. Of course, this only handles simple verification codes.
The approach is not limited to verification codes. It can also be applied to license plate numbers, ID card numbers, house numbers, and other assorted content.
The recognition process is straightforward:
1. Pre-process the image.
2. Project the image along the Y axis (count the foreground pixels in each column) to get a histogram.
3. Analyze the histogram to find the partitions.
4. Split the image into individual characters based on the partitions (this is the key step: the better the split, the higher the recognition rate later).
5. Discard blank or invalid characters.
6. Automatically rotate skewed characters upright, then recognize them.
If characters in the sample image touch each other, the partition may be inaccurate, and automatic rotation then becomes difficult.
Currently, characters can be separated. The next step is to study how to recognize them. (If the individual characters are fairly standard, a ready-made OCR control can be used.)
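Steps 2 to 5 can be sketched roughly as follows. This is only a minimal illustration, not the downloadable source code: it assumes the image has already been pre-processed into a binary grid (a list of rows, 1 = ink, 0 = background), and the function names and the `threshold`, `min_width`, and `min_pixels` parameters are hypothetical.

```python
def column_projection(binary):
    """Step 2: count the ink pixels in each column (projection along the Y axis)."""
    return [sum(column) for column in zip(*binary)]


def find_segments(hist, threshold=0):
    """Step 3: return (start, end) column ranges where hist > threshold.

    `end` is exclusive. `threshold` is an assumed tuning parameter.
    """
    segments = []
    start = None
    for x, count in enumerate(hist):
        if count > threshold and start is None:
            start = x                      # a run of ink columns begins
        elif count <= threshold and start is not None:
            segments.append((start, x))    # the run ends at a gap
            start = None
    if start is not None:
        segments.append((start, len(hist)))
    return segments


def split_characters(binary, threshold=0, min_width=2, min_pixels=4):
    """Steps 4 and 5: cut out each segment and drop blank or tiny pieces."""
    hist = column_projection(binary)
    pieces = []
    for start, end in find_segments(hist, threshold):
        too_narrow = end - start < min_width
        too_faint = sum(hist[start:end]) < min_pixels
        if too_narrow or too_faint:
            continue                       # treat as noise, not a character
        pieces.append([row[start:end] for row in binary])
    return pieces


# Toy 5x12 image: two blocks standing in for characters, plus one noise pixel.
img = [[0] * 12 for _ in range(5)]
for y in range(1, 4):
    for x in range(1, 4):
        img[y][x] = 1
for y in range(5):
    for x in range(6, 9):
        img[y][x] = 1
img[2][10] = 1

pieces = split_characters(img)
print([len(piece[0]) for piece in pieces])  # prints [3, 3]
```

The single noise pixel in column 10 forms its own segment but is discarded because it is narrower than `min_width`. Touching characters would show up here as one wide segment, which is why the split quality matters so much.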
Here are some examples.
Common Verification Code (no difficulty)
Verification code with interference
High-intensity interference (the current partition algorithm fails here; a better algorithm, such as dynamic thresholds, is needed)
CSDN verification code (no problem)
ID card number
License plate number
Here is one more: a QQ verification code. It is difficult to split using a single threshold. The partition must also take the character width into account.
This is the result of partitioning with a single threshold (no limit on the width), and the result is poor.
Next, I will continue to study how to optimize the partition algorithm and how to recognize individual characters (multiple recognition passes plus sample training can be considered).
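One possible way to combine a width limit with a varying threshold is sketched below. It is an assumption about how such a scheme could work, not the algorithm used in this project: any segment wider than a hypothetical `max_width` is re-examined with a gradually raised threshold until a valley in the histogram splits it. All names here are made up for illustration.

```python
def find_segments(hist, threshold=0):
    """Return (start, end) column ranges where hist > threshold (end exclusive)."""
    segments = []
    start = None
    for x, count in enumerate(hist):
        if count > threshold and start is None:
            start = x
        elif count <= threshold and start is not None:
            segments.append((start, x))
            start = None
    if start is not None:
        segments.append((start, len(hist)))
    return segments


def split_wide(hist, start, end, max_width, threshold):
    """Split the range [start, end) further if it is wider than max_width."""
    if end - start <= max_width:
        return [(start, end)]
    peak = max(hist[start:end])
    # Raise the threshold step by step, for this range only.
    for t in range(threshold + 1, peak):
        parts = [(start + s, start + e)
                 for s, e in find_segments(hist[start:end], t)]
        if len(parts) > 1:
            result = []
            for s, e in parts:
                result.extend(split_wide(hist, s, e, max_width, t))
            return result
    return [(start, end)]  # no valley found; keep the range as one piece


def partition(hist, max_width, threshold=0):
    """Partition with a base threshold, then re-split over-wide segments."""
    result = []
    for s, e in find_segments(hist, threshold):
        result.extend(split_wide(hist, s, e, max_width, threshold))
    return result


# Hypothetical projection: two touching characters (shallow valley at index 4)
# followed by one separate character.
hist = [0, 4, 6, 5, 1, 5, 7, 4, 0, 3, 5, 3, 0]

print(find_segments(hist, 0))        # prints [(1, 8), (9, 12)]
print(partition(hist, max_width=4))  # prints [(1, 4), (5, 8), (9, 12)]
```

Raising the threshold also drops the low columns at the cut, so a real implementation would probably want to assign those columns to one of the neighbouring characters. Heavily overlapping characters with no valley at all are left unsplit by this sketch.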