[Repost] Cracking Discuz Forum Verification Codes

Source: Internet
Author: User

Original article address:

Http://www.seoo.org/2008/06/09/discuznet's GIF animation Verification Code cracking success. html

Http://www.seoo.org/2008/07/06/?background Verification Code cracking .html

 

Cracking the GIF Animation Verification Code

1. Parse the GIF animation to get the total number of frames and the information for each frame.

2. Pick the frame with the longest delay.

3. Use the color of each pixel in the first row to remove the background (limit the removal range, otherwise the text itself may be removed).

4. Apply a morphological closing to reduce noise, then threshold to obtain a clean black-and-white verification code.

5. Use the gaps between characters to separate them.

6. Extract features from the samples for machine learning.

7. With 200 samples, the recognition rate can exceed 80%. With further training it can go higher.
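
Steps 2, 3 and 5 above can be sketched in plain Python. This is only a rough illustration under some assumptions, not the original author's code: a frame is a list of rows of (r, g, b) tuples, the per-frame delays have already been read by some GIF library (Pillow, for example, exposes the delay as info["duration"] on each frame), and the function names and the tolerance value are made up for the example. The closing from step 4 is left out.

```python
def longest_delay_index(delays_ms):
    """Step 2: index of the frame that stays on screen the longest."""
    return max(range(len(delays_ms)), key=lambda i: delays_ms[i])


def remove_background(pixels, tolerance=30):
    """Step 3: treat every color seen in the first row as background.

    A pixel becomes 0 (background) if each channel is within `tolerance`
    of some first-row color, otherwise 1 (candidate text). The tolerance
    is what limits the removal range; its value here is an assumption.
    """
    background = set(pixels[0])

    def near(color, ref):
        return all(abs(a - b) <= tolerance for a, b in zip(color, ref))

    return [[0 if any(near(c, ref) for ref in background) else 1 for c in row]
            for row in pixels]


def split_columns(binary):
    """Step 5: cut wherever a column contains no text pixels.

    Returns half-open (start, end) column ranges, left to right.
    """
    width = len(binary[0])
    has_ink = [any(row[x] for row in binary) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(has_ink + [False]):  # sentinel closes the last span
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            spans.append((start, x))
            start = None
    return spans


W, K = (255, 255, 255), (0, 0, 0)
frame = [
    [W, W, W, W, W, W],
    [W, K, W, W, K, K],
    [W, K, W, W, K, W],
]
binary = remove_background(frame)
print(longest_delay_index([100, 100, 2800, 100]))  # 2
print(split_columns(binary))                       # [(1, 2), (4, 6)]
```

Splitting on empty columns only works while the characters do not touch each other.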

 

 

 

Cracking the Complex-Background Verification Code

First, we need to remove the background. With a slightly complex background like this one, the previous method is hard to apply. The example shown is not the most obvious case: in many pictures the background colors are close to the letter colors, and both the letter colors and the background keep changing.

My initial idea was to find the colors that occur most often in the image. I represented the color of each pixel in HSL and built a histogram to find the largest peaks. These peaks give the L values of the most abundant colors in the image.
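
A minimal sketch of that statistic, using only the standard library. The bin count, the function name and the idea of taking the fullest bins as "peaks" are my assumptions; a real peak detector would also merge neighbouring bins that belong to the same peak.

```python
import colorsys
from collections import Counter


def lightness_peaks(pixels, bins=20, top=3):
    """Histogram the HSL lightness of every pixel and return the fullest bins.

    pixels: iterable of (r, g, b) tuples with channels in 0..255.
    Returns a list of (bin_index, pixel_count), most frequent first.
    """
    counts = Counter()
    for r, g, b in pixels:
        # colorsys uses the order (hue, lightness, saturation).
        _, lightness, _ = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
        counts[min(int(lightness * bins), bins - 1)] += 1
    return counts.most_common(top)


pixels = [(255, 255, 255)] * 50 + [(200, 30, 30)] * 30 + [(0, 0, 0)] * 5
print(lightness_peaks(pixels, top=2))  # [(19, 50), (9, 30)]
```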

The rest can be considered noise. Separating the pixels that belong to each peak gives one image per peak.

In this way the picture is split into single-color images, and we then keep the ones that remain after the black and white images are discarded.
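
The split could look roughly like this. It is a sketch under assumptions: each peak is identified by a lightness bin (20 bins here), the lowest and highest bins stand in for "black" and "white", and the function name is hypothetical.

```python
import colorsys


def split_by_lightness(image, peak_bins, bins=20):
    """Return {bin: mask}, where mask has 1 for pixels whose lightness is in that bin.

    image: list of rows of (r, g, b) tuples with channels in 0..255.
    Bin 0 (near black) and bin bins-1 (near white) are dropped.
    """
    layers = {p: [[0] * len(row) for row in image]
              for p in peak_bins if 0 < p < bins - 1}
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            _, lightness, _ = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
            k = min(int(lightness * bins), bins - 1)
            if k in layers:
                layers[k][y][x] = 1
    return layers


W, R, G = (255, 255, 255), (200, 30, 30), (30, 120, 30)
image = [[W, R, W],
         [G, R, W]]
layers = split_by_lightness(image, [19, 9, 5])
print(sorted(layers))  # [5, 9]
print(layers[9])       # [[0, 1, 0], [0, 1, 0]]
```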

Then apply grayscale conversion, thresholding, and noise reduction to obtain the cleaned-up characters.
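
These three operations can be sketched as follows. The article does not say which noise reduction was used, so removing isolated pixels is my stand-in; the cutoff of 128 and the function names are assumptions too.

```python
def to_gray(image):
    """Grayscale via the usual luma weights (0.299, 0.587, 0.114)."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for r, g, b in row]
            for row in image]


def threshold(gray, cutoff=128):
    """1 = dark (ink), 0 = light (background)."""
    return [[1 if v < cutoff else 0 for v in row] for row in gray]


def drop_isolated(binary):
    """Clear ink pixels that have no ink among their 8 neighbours."""
    h, w = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    for y in range(h):
        for x in range(w):
            if not binary[y][x]:
                continue
            # Sum over the 3x3 window, minus the pixel itself.
            neighbours = sum(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            ) - 1
            if neighbours == 0:
                out[y][x] = 0
    return out


K, W = (0, 0, 0), (255, 255, 255)
image = [[K, K, W, W],
         [K, W, W, W],
         [W, W, W, K]]
clean = drop_isolated(threshold(to_gray(image)))
print(clean)  # [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
```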

Then put the letters in order by the leftmost X position of each one, as found by boundary detection.
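
One way to get those positions is to flood-fill each connected region and sort the bounding boxes by their left edge. This is an assumed approach, since the article does not name the boundary detection it used, and it assumes each letter is one connected region.

```python
def bounding_boxes(binary):
    """Bounding boxes (left, top, right, bottom) of 8-connected ink regions,
    sorted by left edge."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            stack = [(x0, y0)]
            left = right = x0
            top = bottom = y0
            while stack:
                x, y = stack.pop()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for ny in range(max(0, y - 1), min(h, y + 2)):
                    for nx in range(max(0, x - 1), min(w, x + 2)):
                        if binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return sorted(boxes)  # tuples compare on `left` first


binary = [
    [0, 0, 0, 0, 1, 1],
    [1, 0, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 0],
]
# The scan meets the right-hand region first; sorting restores reading order.
print(bounding_boxes(binary))  # [(0, 1, 1, 2), (4, 0, 5, 1)]
```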

The next step is to normalize each character image into a standard template. A small amount of learning is then enough to reach a recognition rate of more than 95%.

C: 15 J: 8 T: 9 x: 7 h: 7 F: 8 E: 18 B: 5 Y: 3 K: 4 W: 3 G: 5 R: 2 M: 3 Q: 4 V: 2 P: 3
The data above means that C has been learned 15 times, J has been learned 8 times, and so on.
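
As an illustration of "standard template plus learning", here is a small nearest-template classifier. It is my own sketch, not the author's learner: the 8x8 template size, the Hamming distance and the class and method names are all assumptions, and every image is expected to contain at least one ink pixel.

```python
from collections import Counter


def normalize(binary, size=8):
    """Crop to the ink's bounding box and resample to size x size (nearest neighbour)."""
    ys = [y for y, row in enumerate(binary) if any(row)]
    xs = [x for x in range(len(binary[0])) if any(row[x] for row in binary)]
    top, left = ys[0], xs[0]
    h, w = ys[-1] - top + 1, xs[-1] - left + 1
    return tuple(
        tuple(binary[top + y * h // size][left + x * w // size] for x in range(size))
        for y in range(size)
    )


class TemplateLibrary:
    def __init__(self):
        self.samples = []  # (template, label) pairs

    def learn(self, binary, label):
        self.samples.append((normalize(binary), label))

    def counts(self):
        """How many times each character has been learned."""
        return Counter(label for _, label in self.samples)

    def recognize(self, binary):
        target = normalize(binary)

        def distance(template):  # number of differing cells
            return sum(a != b for ra, rb in zip(template, target)
                       for a, b in zip(ra, rb))

        return min(self.samples, key=lambda s: distance(s[0]))[1]


lib = TemplateLibrary()
lib.learn([[1, 0, 0], [1, 0, 0], [1, 1, 1]], "L")
lib.learn([[1, 1, 1], [0, 1, 0], [0, 1, 0]], "T")
big_l = [[1, 1, 0, 0, 0, 0]] * 4 + [[1, 1, 1, 1, 1, 1]] * 2
print(lib.recognize(big_l))  # L
```

Because every sample is cropped and rescaled first, the same letter drawn at a different size lands on the same template.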

As long as the characters do not stick together, most verification code interference techniques can be dealt with. That is why the Google verification code looks very simple, yet nobody can crack it well.

Supplement:
In the comments, Rise raised the problem of characters that have noise dots mixed in. Because this kind of verification code is not very common, I did a little research.

In cy3e, the 3 is a character image with no noise dots. Following the method described in the article, how do you know that this 3 is not just another image of dots, like the images of the other colors?

I think we need to add a step that fills the image produced by each color filter.
Find the source image of the 3:

Then apply the fill algorithm:

To tell this image apart from the other images, which consist only of scattered dots, you can use the following measures:
1. Width of the connected points
2. Number of connected points
In this way, only the filtered images for cy3e are left.
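
Both measures can be computed per connected region. A rough sketch, where the two thresholds and the function names are assumptions that would need tuning on real images:

```python
def components(binary):
    """Yield (pixel_count, width) for each 8-connected ink region."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    for y0 in range(h):
        for x0 in range(w):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            stack, count, left, right = [(x0, y0)], 0, x0, x0
            while stack:
                x, y = stack.pop()
                count += 1
                left, right = min(left, x), max(right, x)
                for ny in range(max(0, y - 1), min(h, y + 2)):
                    for nx in range(max(0, x - 1), min(w, x + 2)):
                        if binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((nx, ny))
            yield count, right - left + 1


def looks_like_character(binary, min_pixels=4, min_width=2):
    """Keep a filtered image only if some region is both big and wide enough."""
    return any(n >= min_pixels and width >= min_width
               for n, width in components(binary))


dots = [[1, 0, 1, 0],
        [0, 0, 0, 0],
        [1, 0, 1, 0]]
glyph = [[1, 1, 1, 0],
         [0, 0, 1, 0],
         [1, 1, 1, 0]]
print(looks_like_character(dots), looks_like_character(glyph))  # False True
```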

As for the character skew problem, I think that during machine learning we can rotate each picture being learned through a range of angles, for example from -10 to +10 degrees. The learning library will be larger, but for 10-digit verification codes this performance loss should be negligible.
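
A sketch of that rotation for binary images, using nearest-neighbour sampling on a canvas of the same size. The 5-degree step and the function names are assumptions, and pixels that rotate off the canvas are simply lost.

```python
import math


def rotate(binary, degrees):
    """Rotate a binary image about its centre (nearest neighbour, same canvas)."""
    h, w = len(binary), len(binary[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    cos_a = math.cos(math.radians(degrees))
    sin_a = math.sin(math.radians(degrees))
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: look up the source pixel that lands on (x, y).
            sx = round(cx + (x - cx) * cos_a + (y - cy) * sin_a)
            sy = round(cy - (x - cx) * sin_a + (y - cy) * cos_a)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = binary[sy][sx]
    return out


def augmented(binary, max_degrees=10, step=5):
    """Rotated copies from -max_degrees to +max_degrees, including 0."""
    return [rotate(binary, a) for a in range(-max_degrees, max_degrees + 1, step)]


sample = [[0, 1, 1, 0],
          [0, 1, 0, 0],
          [0, 1, 0, 0],
          [0, 1, 0, 0]]
variants = augmented(sample)
print(len(variants))           # 5
print(variants[2] == sample)   # True (the 0-degree copy)
```

Each rotated copy would be learned under the same label as the original, which is what makes the library grow.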
