Identifying and cracking verification codes with complex backgrounds, using Discuz's animated verification code as an example.

Source: Internet
Author: User

Complicated verification codes, such as the latest ones on DZ (Discuz) forums, are harder to handle, but the principle is the same as for ordinary recognition. The only addition is a background-processing step. Let's look at an approach for recognizing DZ forum verification codes.

// Power by www.crazycoder.cn

First, we need to remove the background. For a slightly complex background like this one, the earlier methods are hard to apply. A single example does not show it clearly, but I found that in many images the background colors and the letter colors are similar. In addition, both the letter color and the background keep changing.

My initial idea was to find the most frequent colors in the image. We represent the color of each pixel in HSL, build a histogram, and take the largest peaks, which are the L values of the most abundant colors in the image.
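The histogram step might be sketched as follows. This is a minimal illustration, not the author's code: the function names, the 32-bin quantization, and the `min_share` cutoff are all assumptions, and pixels are taken to be plain `(r, g, b)` tuples in the range 0..255.

```python
import colorsys
from collections import Counter

def lightness_histogram(pixels, bins=32):
    """Count pixels per quantized HSL lightness (L) bin."""
    hist = Counter()
    for r, g, b in pixels:
        # colorsys returns (h, l, s), each in the range 0..1
        _h, l, _s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
        hist[min(int(l * bins), bins - 1)] += 1
    return hist

def top_peaks(hist, count=5, min_share=0.02):
    """Return the most populated L bins; bins below min_share are treated as noise."""
    total = sum(hist.values())
    return [b for b, n in hist.most_common(count) if n / total >= min_share]
```

Quantizing L into a few bins keeps near-identical shades in the same peak; the bin count would need tuning against real images.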

The rest can be treated as noise. Separating the pixels that belong to each peak gives one image per peak.

In this way the picture is split into single-color images. We then discard the black and white ones and keep the rest.
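A rough sketch of the split, assuming the image is a 2D list of `(r, g, b)` tuples and that the peaks are given as quantized lightness bins. The names `split_layers` and `drop_black_and_white`, the 32 bins, and the `margin` parameter are illustrative assumptions.

```python
import colorsys

def split_layers(image, peak_bins, bins=32):
    """Build one binary mask per peak bin: 1 where the pixel falls in that bin."""
    layers = {b: [[0] * len(row) for row in image] for b in peak_bins}
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            _h, l, _s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
            k = min(int(l * bins), bins - 1)
            if k in layers:
                layers[k][y][x] = 1
    return layers

def drop_black_and_white(layers, bins=32, margin=1):
    """Discard layers whose lightness bin is at the black or white extreme."""
    return {b: m for b, m in layers.items() if margin <= b < bins - margin}
```

With `margin=1`, only the darkest and the lightest bin are dropped; a wider margin would also remove near-black and near-white layers.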

Then apply grayscale conversion, thresholding, and noise reduction to obtain a clean image of each character.
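These three steps could look roughly like the sketch below. The luminance weights are the common 299/587/114 integer approximation; the cutoff of 128 and the rule "remove ink pixels with no ink neighbor" are assumptions chosen for illustration, since the article does not say which noise filter was used.

```python
def to_gray(image):
    """Convert a 2D list of (r, g, b) tuples to 0..255 gray values."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for r, g, b in row]
            for row in image]

def threshold(gray, cutoff=128):
    """Binarize: 1 marks ink (dark pixels), 0 marks background."""
    return [[1 if v < cutoff else 0 for v in row] for row in gray]

def remove_isolated(mask, min_neighbors=1):
    """Clear ink pixels that have fewer than min_neighbors ink pixels around them."""
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                # Sum the 3x3 window, then subtract the pixel itself.
                around = sum(mask[j][i]
                             for j in range(max(0, y - 1), min(h, y + 2))
                             for i in range(max(0, x - 1), min(w, x + 2))) - 1
                if around < min_neighbors:
                    out[y][x] = 0
    return out
```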

Then put the characters in order, sorting them by the leftmost X coordinate found by boundary detection.
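One way to sketch this: find the bounding box of each connected ink blob, then sort the boxes by their left edge. The article does not describe its boundary detection, so the breadth-first search over 8-connected pixels used here is an assumption, as are the function names.

```python
from collections import deque

def find_components(mask):
    """Return bounding boxes (left, top, right, bottom) of 8-connected ink blobs."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                seen[y][x] = True
                queue = deque([(x, y)])
                left, top, right, bottom = x, y, x, y
                while queue:
                    cx, cy = queue.popleft()
                    left, right = min(left, cx), max(right, cx)
                    top, bottom = min(top, cy), max(bottom, cy)
                    for ny in range(max(0, cy - 1), min(h, cy + 2)):
                        for nx in range(max(0, cx - 1), min(w, cx + 2)):
                            if mask[ny][nx] and not seen[ny][nx]:
                                seen[ny][nx] = True
                                queue.append((nx, ny))
                boxes.append((left, top, right, bottom))
    return boxes

def reading_order(boxes):
    """Sort character boxes left to right by their leftmost X coordinate."""
    return sorted(boxes, key=lambda box: box[0])
```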

The next step is to normalize each character image into a standard template. With a small amount of learning, the recognition rate exceeds 95%.

C: 15 J: 8 T: 9 x: 7 h: 7 F: 8 E: 18 B: 5 Y: 3 K: 4 W: 3 G: 5 R: 2 M: 3 Q: 4 V: 2 P: 3
The data above means that C was learned 15 times, J was learned 8 times, and so on.
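The article does not say how templates are stored or compared. As one plausible sketch, each character mask can be cropped to its ink, resampled to a fixed grid, and matched against the learned samples by counting differing cells. The class `TemplateLibrary`, the 8x8 grid, and the nearest-sample rule are all assumptions.

```python
from collections import Counter

def normalize(mask, size=8):
    """Crop to the ink bounding box and resample to a size x size vector."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    if not ys:
        return tuple([0] * (size * size))
    top, left = ys[0], xs[0]
    h, w = ys[-1] - top + 1, xs[-1] - left + 1
    # Nearest-neighbour resampling of the cropped region.
    return tuple(mask[top + j * h // size][left + i * w // size]
                 for j in range(size) for i in range(size))

class TemplateLibrary:
    def __init__(self):
        self.samples = []  # list of (label, normalized vector)

    def learn(self, label, mask):
        self.samples.append((label, normalize(mask)))

    def recognize(self, mask):
        """Return the label of the learned sample with the fewest differing cells."""
        vec = normalize(mask)
        label, _ = min(self.samples,
                       key=lambda s: sum(a != b for a, b in zip(s[1], vec)))
        return label

    def counts(self):
        """How many times each label has been learned."""
        return Counter(label for label, _ in self.samples)
```

Calling `counts()` yields a per-character tally of the same kind as the list above.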

As long as the characters do not stick together, most verification-code interference techniques can be defeated this way, which explains why Google's verification code looks very simple yet no one has cracked it well.

Supplement:
In the comments, Rise raised the problem of characters mixed with noise colors. Because this kind of verification code is not very common, I did a little more research.

Take the code cy3e, in which the character 3 has no noise mixed in. Following the method described in the article, how do you know that the layer containing this 3 is not just another single-color layer of noise dots?

I think we need to add a step: run a fill on the image produced by each color filter.
Find the source layer of the 3:

Then apply the fill algorithm.

To tell this image apart from the other layers, which consist entirely of noise dots, you can use the following measures:
1. Width of the connected points
2. Number of connected points
In this way, only the filtered images of cy3e are left.
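The fill and the two measures above could be sketched like this: flood-fill every connected region of a layer, record its pixel count and width, and keep the layer only if some region is large enough to be a character. The thresholds (`min_pixels`, `min_width`, `max_width`) and the use of 4-connectivity are assumptions that would need tuning on real images.

```python
def regions(mask):
    """Flood-fill each 4-connected ink region; return (pixel_count, width) per region."""
    h, w = len(mask), len(mask[0])
    seen = set()
    stats = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and (x, y) not in seen:
                seen.add((x, y))
                stack = [(x, y)]
                count, left, right = 0, x, x
                while stack:
                    cx, cy = stack.pop()
                    count += 1
                    left, right = min(left, cx), max(right, cx)
                    for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                                   (cx, cy + 1), (cx, cy - 1)):
                        if (0 <= nx < w and 0 <= ny < h
                                and mask[ny][nx] and (nx, ny) not in seen):
                            seen.add((nx, ny))
                            stack.append((nx, ny))
                stats.append((count, right - left + 1))
    return stats

def looks_like_character(mask, min_pixels=6, min_width=2, max_width=20):
    """True if any connected region is big and wide enough to be a glyph."""
    return any(n >= min_pixels and min_width <= width <= max_width
               for n, width in regions(mask))
```

A layer of scattered dots yields many regions of one or two pixels each, so it fails both measures, while a layer holding a real character passes.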

As for skewed characters, I think we can rotate each training image through a range of angles during the machine learning process, for example from -10 to +10 degrees. The learning library will be larger, but for 10-digit verification codes this performance loss should be negligible.
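A minimal sketch of that idea, assuming binary character masks and nearest-neighbour rotation about the image center. The 5-degree step and the function names are assumptions; the article only gives the -10 to +10 degree range.

```python
import math

def rotate(mask, degrees):
    """Rotate a binary mask about its center using inverse nearest-neighbour mapping."""
    h, w = len(mask), len(mask[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    cos_a = math.cos(math.radians(degrees))
    sin_a = math.sin(math.radians(degrees))
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # For each output pixel, look up the source pixel it came from.
            sx = cos_a * (x - cx) + sin_a * (y - cy) + cx
            sy = -sin_a * (x - cx) + cos_a * (y - cy) + cy
            ix, iy = int(round(sx)), int(round(sy))
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = mask[iy][ix]
    return out

def augment(mask, max_angle=10, step=5):
    """Return rotated copies of a training mask from -max_angle to +max_angle."""
    return [rotate(mask, angle)
            for angle in range(-max_angle, max_angle + 1, step)]
```

Each rotated copy would be learned under the same label as the original, so the library grows by a factor equal to the number of angles.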
