Contents
- Implementation of verification code recognition
Overview
A verification code, also called a graphic code or CAPTCHA, is a technique that prevents software from performing certain operations automatically. It is widely used for account registration and login verification in all kinds of systems. To a certain extent, it stops software from automatically guessing accounts and passwords, registering accounts, and similar behavior.
However, for various reasons we sometimes need to automate exactly these "forbidden" actions. That brings us to the subject of this article: verification code recognition technology, which can also be seen as graphics recognition or pattern recognition technology.
Today's verification codes are no longer as simple as those of the early Internet. Early verification codes could be recognized with a simple two-dimensional matrix comparison, because they used a fixed font and the output received only simple processing, such as randomly generated noise and randomly changed character colors.
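As a rough illustration of the two-dimensional matrix comparison mentioned above, here is a minimal sketch in Python. The 3x3 templates and the function names `mismatch` and `match_glyph` are hypothetical and chosen only for this example; a real fixed-font code would use the full glyph bitmaps of that font. The sketch assumes each glyph has already been cut out and reduced to a 0/1 matrix of the same size as the templates.

```python
# Hypothetical 3x3 templates for illustration only.
TEMPLATES = {
    "1": [(0, 1, 0),
          (0, 1, 0),
          (0, 1, 0)],
    "7": [(1, 1, 1),
          (0, 0, 1),
          (0, 0, 1)],
    "0": [(1, 1, 1),
          (1, 0, 1),
          (1, 1, 1)],
}

def mismatch(a, b):
    """Count the cells where two equally sized 0/1 matrices differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def match_glyph(glyph, templates=TEMPLATES):
    """Return the template character with the fewest differing cells."""
    return min(templates, key=lambda ch: mismatch(glyph, templates[ch]))

# A "7" with one noise pixel in the bottom-left corner.
noisy_seven = [(1, 1, 1),
               (0, 0, 1),
               (1, 0, 1)]
print(match_glyph(noisy_seven))  # prints 7
```

Choosing the template with the fewest mismatches, instead of requiring an exact match, is what lets this kind of comparison tolerate a little random noise.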
Early graphic codes arranged their characters neatly, and the number of characters was fixed. For example, the first character appeared at X: 10, Y: 3 and the second at X: 25, Y: 3. Character size and character contour did not change, and the background was plain or fixed. Codes of that kind are not discussed in this article.
This article discusses recognition of advanced verification codes: distorted or rotated characters, an uncertain number of characters, uncertain character position and size, and complex backgrounds (multiple colors, color gradients, or texture patterns). The techniques discussed here are not limited to website verification codes; they are also widely used in industry and medicine, for example in license plate recognition and OCR of printed text. However, we will focus on verification code recognition.
Implementation of verification code recognition
The first step in recognizing a complex graphic code is locating and cutting the image. This means separating the information part (the part containing the character contours) from the background, cutting it into individual characters, and binarizing the information part so that only black (#000000) and white (#ffffff) remain. If the image is not located and cut first, both the computational workload and the recognition error rate are very high.
For codes with a simple (monochrome) background, separating the background from the foreground is easy. For codes with a multi-color, gradient, or textured background, separating the background is very complicated.
Here we should consider how people think. When we look at such a verification code, we do not separate the background first. We first recognize the foreground outlines we already know (letters and digits), then mentally remove the foreground, and only then notice that what remains is a textured background.
Based on this idea, we skip background separation and look for the foreground outline directly. We are only interested in the foreground, so if we can obtain its outline, we never have to separate the background. And if there is no clear outline or contrast between the background and foreground of a verification code, then not only a machine but even the human eye cannot read it.
Therefore, we set a reasonable "threshold" and use an edge detection algorithm to split the image along its contours, where the contrast is strong. Note, however, that the extracted parts contain not only foreground (character outlines) but also other fragments, such as the background or pieces of it, or the hollow interior of the character "0".
The edge detection algorithm used here can be summed up with one vivid image: "burning". (In standard terminology, this procedure is closer to flood fill or region growing than to classic edge detection.)
1. Burning starts at X: 0, Y: 0. Adjacent pixels whose color difference is within the "threshold" are ignited and marked as "ignited".
2. From each pixel ignited in step 1, recursively continue to ignite other adjacent pixels within the "threshold". (Note that each recursive step compares the color difference between the last ignited pixel and the current pixel against the "threshold", rather than always comparing against the pixel at X: 0, Y: 0. This is what allows gradient backgrounds to be handled.)
3. When the first round of burning (started at X: 0, Y: 0) has gone out, the first edge detection pass is complete. Then traverse the image, find a pixel still marked "unignited", and repeat the burning process from there, until all pixels are marked "ignited".
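The three steps above can be sketched in Python roughly as follows. This is a minimal recursive version for small images, under a few assumptions that the article does not specify: "adjacent" means the four direct neighbours, the color difference is the largest per-channel difference, and the names `color_diff`, `burn`, and `segment` are made up for this example.

```python
def color_diff(a, b):
    """Largest per-channel difference between two RGB pixels (an assumed metric)."""
    return max(abs(x - y) for x, y in zip(a, b))

def burn(image, labels, x, y, label, threshold):
    """Recursively ignite the pixel at (x, y) and its similar neighbours."""
    labels[y][x] = label
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if (0 <= ny < len(image) and 0 <= nx < len(image[0])
                and labels[ny][nx] is None
                # Compare with the pixel that just burned, not with the seed.
                and color_diff(image[y][x], image[ny][nx]) <= threshold):
            burn(image, labels, nx, ny, label, threshold)

def segment(image, threshold=16):
    """Label every pixel with the index of the burn that ignited it."""
    labels = [[None] * len(image[0]) for _ in image]
    next_label = 0
    for y in range(len(image)):
        for x in range(len(image[0])):
            if labels[y][x] is None:  # still "unignited"
                burn(image, labels, x, y, next_label, threshold)
                next_label += 1
    return labels, next_label

# 3x5 image: a left-to-right grey gradient with one dark pixel in the middle.
image = [[(200 + 5 * x,) * 3 for x in range(5)] for _ in range(3)]
image[1][2] = (20, 20, 20)
labels, count = segment(image)
print(count)  # prints 2
for row in labels:
    print(row)
```

In this example the leftmost and rightmost background pixels differ by 20, which is more than the threshold of 16, yet they end up in the same region because each neighbouring pair differs by only 5. That is the effect of comparing against the last ignited pixel.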
In practice, however, recursive functions are limited by the stack size. When processing larger images (such as 300*300), the recursion can become so deep that a stack overflow occurs and your program crashes. Other methods solve this problem, such as keeping a "burning task table" of pixels that still have to be processed. In essence it is still the same traversal.
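A "burning task table" could look like the following Python sketch, where an explicit queue replaces the call stack. To keep the example short it assumes a greyscale image stored as integers and four-neighbour adjacency; the function name `burn_iterative` is hypothetical.

```python
from collections import deque

def burn_iterative(image, labels, start, label, threshold):
    """Ignite one region using an explicit task table instead of recursion.

    Returns the number of pixels in the region.
    """
    height, width = len(image), len(image[0])
    tasks = deque([start])              # the "burning task table"
    labels[start[1]][start[0]] = label
    size = 0
    while tasks:
        x, y = tasks.popleft()
        size += 1
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nx < width and 0 <= ny < height
                    and labels[ny][nx] is None
                    and abs(image[y][x] - image[ny][nx]) <= threshold):
                labels[ny][nx] = label  # mark when queued, so it is queued once
                tasks.append((nx, ny))
    return size

# A 300x300 single-colour image forms one region of 90,000 pixels.
image = [[128] * 300 for _ in range(300)]
labels = [[None] * 300 for _ in range(300)]
print(burn_iterative(image, labels, (0, 0), 0, threshold=16))  # prints 90000
```

Because the pending pixels live in an ordinary data structure, the depth of the traversal is limited only by available memory, not by the call stack.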
In general, we should discard any segment that covers more than 50% of the image, because a segment that large can only be background. Segments smaller than 5% should also be discarded; they are probably background texture, noise, or interference lines. By studying the results of many edge detection runs, we can always arrive at reasonable values that remove most of the irrelevant segments.
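The size filter described above can be sketched in Python as follows. The region sizes are hypothetical numbers invented for this example, and the function name `keep_candidates` is an assumption; the 5% and 50% limits come from the text and would be tuned in practice.

```python
def keep_candidates(region_sizes, total_pixels, low=0.05, high=0.50):
    """Return the labels of regions whose share of the image lies in [low, high]."""
    return [label for label, size in region_sizes.items()
            if low <= size / total_pixels <= high]

# Hypothetical pixel counts per region for a 100x40 image (4000 pixels).
region_sizes = {0: 2600, 1: 420, 2: 380, 3: 35, 4: 450, 5: 115}
print(keep_candidates(region_sizes, total_pixels=4000))  # prints [1, 2, 4]
```

Region 0 (65% of the image) is dropped as background, and regions 3 and 5 (under 5%) are dropped as noise or interference.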
After this filtering and precise edge detection, we can fill each remaining segment with a single color for subsequent recognition. Before the actual recognition starts, however, we still need to perform a few operations on these images.
1. Scaling: scale the image to be recognized to a specified size. The size should be as small as possible to speed up recognition, without losing image detail. I use 24x24 here, mainly because the verification characters are quite complex (handwritten and hollow); if the image is too small, too many details are lost.
2. Rotation: detect the rotation angle of the image to be recognized and rotate it back before recognition.
3. Image repair: after edge detection, the image may be damaged. For example, there may be a hollow point in the middle of the image, and some algorithm is needed to repair it.
4. Detection of adhering and overlapping characters, followed by re-segmentation or mask recognition. This is a very important step for complex verification codes and a common difficulty in the industry.
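As an illustration of the scaling step (item 1 above), here is a minimal Python sketch that resizes a monochrome segment to 24x24. It assumes nearest-neighbour sampling, which the article does not specify, and the function name `scale_nearest` is hypothetical.

```python
def scale_nearest(glyph, size=24):
    """Resize a 2-D 0/1 matrix to size x size by nearest-neighbour sampling."""
    src_h, src_w = len(glyph), len(glyph[0])
    return [[glyph[y * src_h // size][x * src_w // size]
             for x in range(size)]
            for y in range(size)]

# A 2x2 checker pattern becomes four 12x12 blocks.
glyph = [[1, 0],
         [0, 1]]
scaled = scale_nearest(glyph)
print(len(scaled), len(scaled[0]))  # prints 24 24
```

Nearest-neighbour sampling keeps the image strictly black and white, which suits segments that have already been filled with a single color; a smoother interpolation would need a second binarization pass.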
Please look forward to my next article: Verification Code Recognition Technology (2): Further Processing After Edge Detection