About verification code recognition 1

Last Update:2014-08-14 Source: Internet

Author: User


Part 1: Image composition and the definition of the signature

To recognize verification codes, we first need to understand the basic principle. We will explain it here so that the code later on is easy to follow. (It is actually quite simple; all this talk is just to keep you reading. Sweating again ~!)
The first step in verification code recognition is to break an image down. Every image is made up of individual points, and every point is a color block; put the color blocks together and you get the image. This may sound obvious, but it is worth stating.
The two figures help illustrate this:

Figure 1 Figure 2

The two pictures above show the color blocks clearly. Each small square is a color block (some of them are white). Every point of the image is identified by its coordinates X and Y, so rows and columns are easy to tell apart. Color block X1, Y1 is in the first row, first column; color block X2, Y5 is in the fifth row, second column. Haha!

Now that we understand color blocks, we can use the color of each block as the basis for comparison and recognition. We read the color of every point of a digit in the image. When the color value is black (0), we record the point as 1. When the color value is white (255), we record the point as 0. Doing this for the whole image gives a string in the following format:
0000000000000000000000000000000000000000000000000000000000000000000000000000000000111111111111111100001111111111111111001100000000000000001111000000000000000011110000000000000000111100000000000000001111000000000000000011110000000000000000111100000000000000001111000000000000000011001111111111111111000011111111111111110000000000000000000000000000000000000000000000000000000000000000000000000000000000

The binary-like string above is what we extracted from Figure 1. We call this string the signature of the image. We can read it using what we have just learned. The image is scanned column by column, from top to bottom. The first four columns of Figure 1 are entirely white, so the signature starts with a long run of zeros. >_<
In the fifth column, the first two cells are white and the cells after them are black, so the signature should begin with 82 zeros (4 columns of 20 cells, plus 2) before the first 1 appears. You can count them. Then comes the black part: the 16 consecutive black blocks appear as 16 ones in the signature. Read this way, the structure of the signature becomes very clear.
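The mapping from color blocks to a signature can be sketched without any image library. This is a minimal illustration, not the author's program: the names SignatureDemo and BuildSignature are hypothetical, and the image is assumed to be already available as a grid of gray values indexed as pixels[x, y] (x is the column, y is the row).

```csharp
using System;
using System.Text;

static class SignatureDemo
{
    // Builds a signature from a grid of gray values (0 = black, 255 = white).
    // The grid is walked column by column, top to bottom, matching the
    // order in which the signature in the article is read.
    public static string BuildSignature(byte[,] pixels)
    {
        int width = pixels.GetLength(0);   // number of columns (x)
        int height = pixels.GetLength(1);  // number of rows (y)
        var sb = new StringBuilder(width * height);
        for (int x = 0; x < width; x++)
        {
            for (int y = 0; y < height; y++)
            {
                // Black points become '1', everything else becomes '0'.
                sb.Append(pixels[x, y] == 0 ? '1' : '0');
            }
        }
        return sb.ToString();
    }
}
```

For a grid with three columns and two rows whose columns are white/white, black/white and black/black, BuildSignature returns "001011".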

Now that we can read the image clearly, the next step is to implement this in code. To be continued (extracting signatures with code).

Part 2: Extracting the signature from an image with code

In the previous chapter we covered how an image is composed and how the signature is defined. If anything is unclear, please go back to Part 1. Here we go straight into Part 2, where we write a WinForms program in C# that extracts the signature from an image.

This chapter has three key points.
1. Bitmap.GetPixel(x, y) returns the color of the point at (x, y) in the image.
Note: to use Bitmap, we need to reference two namespaces:
using System.Drawing;
using System.Drawing.Imaging;

2. GetPixel returns the color of one point, but that color is made up of three components, R, G and B, so we need to decide which values to look at. The image in our example is simple: the text is black, RGB (0, 0, 0), and the background is white, RGB (255, 255, 255), so checking the R value alone is enough. If the text contains several colors, it is best to convert the image to grayscale first (for example in Photoshop) and then read the values, so that the signature is more accurate.

3. Nested loops are used to read the color values: one loop walks the columns (x) and the other walks the rows (y), so that every point in the image is scanned. Finally, remember to be "environmentally friendly" and release the image once you are done with it. (*_*)
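For point 2, when the text is not pure black, the grayscale step can also be done in code instead of Photoshop. The sketch below is an assumption-laden illustration rather than part of the original program: PixelClassifier and Classify are hypothetical names, the weights are the commonly used luma coefficients (0.299, 0.587, 0.114), and the threshold of 128 is an assumed midpoint that would need tuning for a real image.

```csharp
using System;

static class PixelClassifier
{
    // Returns '1' for a dark (text) point and '0' for a light (background) point.
    // threshold = 128 is an assumed default, not a value from the article.
    public static char Classify(byte r, byte g, byte b, int threshold = 128)
    {
        // Integer approximation of gray = 0.299 R + 0.587 G + 0.114 B.
        int gray = (r * 299 + g * 587 + b * 114) / 1000;
        return gray < threshold ? '1' : '0';
    }
}
```

With this approach a dark red point such as RGB (200, 0, 0) is still classified as text, whereas a test on R == 0 alone would treat it as background.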

Now that we know what we need to use, we can start to write code.

There is a lot of code, so I only show the important parts here; the rest can be found in the source program, which will be included in the attachment. The source program uses the simplest code possible, without classes or factory patterns. That makes it easy to understand, but it is not good code, so keep this in mind when you write your own.
// Load the image
Bitmap bmp = new Bitmap(drawing);
// Scan every point of the image to build the signature string
string codenumber = ""; // string variable that stores the signature
// Scan the image point by point. If the R value is 0 (black), append 1; otherwise append 0.
for (int x = 0; x < bmp.Width; x++) // walk the columns, from x = 0 to the image width
{
    for (int y = 0; y < bmp.Height; y++) // walk the rows of this column, from y = 0 to the image height
    {
        if (bmp.GetPixel(x, y).R == 0) // the R value of the point at (x, y) is 0
        {
            codenumber = codenumber + "1"; // record 1
        }
        else // otherwise
        {
            codenumber = codenumber + "0"; // record 0
        }
    }
}

// Close the image
bmp.Dispose();
// Display the signature in the richTextBox1 control
richTextBox1.Text = codenumber;

Every line is commented, so I will not go through it again. Once the program runs, write down the signatures it produces; we will need them for the verification code reader. (To be continued in the next chapter: building the verification code reader from the signatures.)

Part 3: Building the verification code reader

In the previous chapter we covered signatures and how to extract them. What remains is to recognize verification codes using those signatures. Clever readers will already have guessed how: there is nothing special about it. We simply compare the pattern recorded for each color block, so recognition is a comparison process. Haha ~~~ In fact, I am just trying to pad this out into a good post. As in the previous chapter, I give little code here. If you need the details, see the source program, which has plenty of comments and should be easy to follow. This example has fewer comments because the key ideas are the same as in the previous one.

Key points of this chapter:
1. Use the extraction program written in the previous chapter to extract the signatures of noise-free images of the digits 0 to 9, and save the signature of each digit. At the start of the program, store the signatures in a string array. The format is as follows:
string[] features;
features = new string[10];
features[0] = "0000000000000000000000000000000000000000000000000000000000000000000000000000000000111111111111111100001111111111111111001100000000000000001111000000000000000011110000000000000000111100000000000000001111000000000000000011110000000000000000111100000000000000001111000000000000000011001111111111111111000011111111111111110000000000000000000000000000000000000000000000000000000000000000000000000000000000";
features[1] = "0000000000000000000000000000000000000000000000000000000000000000000000000000000000110000000000000011001100000000000000110011000000000000001100110000000000000011111111111111111111111111111111111111111100000000000000000011000000000000000000110000000000000000001100000000000000000011000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000";
............

2. Pay attention to the RGB values of the color blocks in the image. Check that the background color and the text color of the image to be read are the same as those of the images the signatures were extracted from. You can look up the values with Photoshop or any screen color picker.

3. Pay attention to the width and height of the image. The signatures extracted above cover the width and height of a single digit. Here we are dealing with the whole image, which must be cut into individual digits before comparison. The image in our example is 120 wide and contains 6 digits, and each extracted signature is 20*20. So the total width is divided into 6 parts, each digit being 20 wide and 20 high, and the whole image is read in 6 passes.

4. Pay attention to the error tolerance for your images. If there is no noise, the tolerance can be 0. If there is noise, you need to decide how many points may differ between the digit being read and the stored signature. This allows better decisions and improves accuracy.

Those are the key points of this chapter. Now let us continue with the program.

The first thing to do is to put the signatures into a string array, as shown above, so we will not repeat that here. Next we load the image; this is the verification code image to be recognized. Then we read it. We do not read the whole image at once. Instead, the image is divided into 20*20 regions, and each time one 20*20 region has been read, a comparison is made. A loop does the job. If this is unclear, see the source program in the attachment.
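Reading the image region by region can be sketched as follows. This is a hedged illustration that works on an in-memory grid rather than a Bitmap so that it stays self-contained; Segmenter and SplitSignatures are hypothetical names, and the grid is assumed to hold gray values indexed as pixels[x, y] with 0 meaning black.

```csharp
using System;
using System.Text;

static class Segmenter
{
    // Cuts the grid into cells of cellWidth columns, left to right,
    // and returns one signature per cell (column by column, top to bottom).
    public static string[] SplitSignatures(byte[,] pixels, int cellWidth)
    {
        int width = pixels.GetLength(0);
        int height = pixels.GetLength(1);
        int count = width / cellWidth; // e.g. 120 / 20 = 6 digits
        var result = new string[count];
        for (int i = 0; i < count; i++)
        {
            var sb = new StringBuilder(cellWidth * height);
            for (int x = i * cellWidth; x < (i + 1) * cellWidth; x++)
            {
                for (int y = 0; y < height; y++)
                {
                    sb.Append(pixels[x, y] == 0 ? '1' : '0');
                }
            }
            result[i] = sb.ToString();
        }
        return result;
    }
}
```

For the 120*20 image from the example, SplitSignatures(pixels, 20) yields six strings of 400 characters each, one per digit, ready to be compared with the stored signatures.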

After reading the signature of a region, we compare it with our stored signatures. First we compare the string lengths. If the lengths differ, there is no point in going further: a match is impossible, so we skip that signature. -_-!!!
If the lengths are the same, we check whether the two strings are exactly equal. If they are, the digit is output directly. This means the image is clean: apart from the background, everything is identical.
If the lengths are equal but the strings are not, we split both strings into characters and compare them one by one. Every position where the characters differ is counted as an error point. If, after the whole loop, the number of error points is smaller than the tolerance we set, the region is taken to be this digit; otherwise we move on to the next signature. If all signatures have been compared without a match, the region cannot be recognized and is skipped.
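The comparison steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the attachment: Matcher, CountErrors and Recognize are hypothetical names, the index in the features array is assumed to be the digit it represents, and -1 is an assumed convention for "not recognized".

```csharp
using System;

static class Matcher
{
    // Counts the positions at which two equal-length signatures differ.
    public static int CountErrors(string a, string b)
    {
        int errors = 0;
        for (int i = 0; i < a.Length; i++)
        {
            if (a[i] != b[i]) errors++;
        }
        return errors;
    }

    // Returns the index of the first stored signature that matches the
    // candidate, or -1 if none does. maxErrors is the error tolerance.
    public static int Recognize(string candidate, string[] features, int maxErrors)
    {
        for (int digit = 0; digit < features.Length; digit++)
        {
            string feature = features[digit];
            // Different lengths can never match, so skip.
            if (feature == null || feature.Length != candidate.Length) continue;
            // Identical strings: a clean, noise-free match.
            if (feature == candidate) return digit;
            // Otherwise accept when fewer points differ than the tolerance allows.
            if (CountErrors(candidate, feature) < maxErrors) return digit;
        }
        return -1;
    }
}
```

The strict comparison (fewer error points than the tolerance) follows the wording of the text. Note that this returns the first signature within the tolerance, not necessarily the closest one; picking the signature with the fewest error points is a possible variation.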

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
