How to Use Deep Learning to Crack Verification Codes: Multi-Character Verification Codes with Keras

Source: Internet
Author: User
Tags keras

When building a web crawler, verification codes (captchas) regularly get in the way. This article introduces an end-to-end verification code recognition method based on deep neural networks. With this method, recognition accuracy above 90% can be achieved without segmenting the image or matching templates.

This article is divided into two parts. The first part describes how to use a deep neural network to train on and recognize verification codes; the second part covers the engineering problems that had to be overcome during implementation.

I. Verification code recognition based on deep neural network

Verification code recognition is the process of going from an image to text. Traditional algorithms such as OCR are designed to solve this kind of problem. In the real world, however, a captcha rarely contains regular text: the characters are usually distorted to varying degrees, and the image itself usually has some amount of noise added. These disturbances make character segmentation and template matching ineffective, so OCR algorithms have difficulty producing a correct result.

In recent years, deep neural networks (DNNs) have proved to be powerful in the field of image recognition. Recognizing a single character is a typical classification problem. The usual practice is to train a deep neural network whose last layer has n outputs, where n is the number of possible characters. For the English alphabet, for example, the last layer of the classifier has 26 outputs. The classic LeNet (http://yann.lecun.com/exdb/lenet/) is a network that solves single-character recognition:

However, a verification code usually contains multiple characters. How can an existing network handle this kind of classification problem? In machine learning, this is called a multi-label training problem: each input image no longer corresponds to a single label, and the output for one image is multiple labels. A traditional neural network needs only a slight change to adapt to this situation.

Let's take the simplest example, the English alphabet, to introduce the process. As shown in Figure I, this type of verification code consists of 5 capital letters, so each character falls into one of 26 categories. Interference lines run through the text, which makes segmenting the characters more difficult.

Figure I. Verification Code Instance

Next, we design a convolutional neural network, as shown in Figure II:

Figure II. Convolutional neural network

The network in Figure II is nothing special compared with a general CNN: the front consists of convolution and pooling layers, and only the final classification layer is expanded from 26 categories to 26*5=130 categories. The label of each image is a 130-dimensional vector in which each consecutive group of 26 dimensions contains a single 1 and is 0 elsewhere, so the five groups encode the five letters. Cross entropy is then used as the cost function to optimize the network. In this way, a simple change to an ordinary classification network is enough to solve the verification code recognition problem.
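The label layout described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the loss applies a softmax within each group of 26 outputs, which is one common way to use cross entropy here; the original text does not say which variant was used.

```python
import math
import string

ALPHABET = string.ascii_uppercase  # 26 classes per position
CODE_LENGTH = 5                    # characters per verification code


def encode_label(text):
    """Turn a 5-letter code into a 130-dimensional 0/1 target vector."""
    if len(text) != CODE_LENGTH:
        raise ValueError("expected %d characters" % CODE_LENGTH)
    target = [0.0] * (CODE_LENGTH * len(ALPHABET))
    for position, char in enumerate(text):
        target[position * len(ALPHABET) + ALPHABET.index(char)] = 1.0
    return target


def decode_scores(scores):
    """Pick the highest-scoring letter inside each group of 26 outputs."""
    n = len(ALPHABET)
    chars = []
    for position in range(CODE_LENGTH):
        group = scores[position * n:(position + 1) * n]
        chars.append(ALPHABET[group.index(max(group))])
    return "".join(chars)


def grouped_cross_entropy(scores, target):
    """Sum of softmax cross-entropy losses, one per group of 26 outputs.

    Assumption: a separate softmax per character position.
    """
    n = len(ALPHABET)
    loss = 0.0
    for position in range(CODE_LENGTH):
        group = scores[position * n:(position + 1) * n]
        labels = target[position * n:(position + 1) * n]
        peak = max(group)  # subtract the maximum for numerical stability
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in group))
        loss -= sum(t * (s - log_norm) for s, t in zip(group, labels))
    return loss
```

For example, encode_label("HELLO") produces a vector with exactly five 1s, one in each group of 26, and decode_scores applied to that vector returns "HELLO" again.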

For verification codes in the form of a mathematical expression, the only change is that the 26-class problem for each character becomes a 13-class problem. In the following example (see Figure III), the last layer of the classifier is designed with 3*13=39 categories.

Figure III. Recognition of a mathematical-expression verification code

Following this idea, we cracked verification codes in many styles, as Figure IV shows:

Figure IV. Examples of cracking verification codes in different styles
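A minimal sketch of reading such a 39-dimensional output follows. The text does not list the 13 classes, so this example assumes they are the ten digits plus three operators, and that the three positions are digit, operator, digit. The symbol set, the function names, and the score vector are all hypothetical.

```python
import operator

# Assumption: the 13 classes are the ten digits plus three operators.
SYMBOLS = "0123456789+-*"
POSITIONS = 3  # assumed layout: digit, operator, digit

OPERATIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def decode_expression(scores):
    """Read one symbol from each group of 13 outputs (39 in total)."""
    n = len(SYMBOLS)
    if len(scores) != POSITIONS * n:
        raise ValueError("expected %d scores" % (POSITIONS * n))
    symbols = []
    for position in range(POSITIONS):
        group = scores[position * n:(position + 1) * n]
        symbols.append(SYMBOLS[group.index(max(group))])
    return "".join(symbols)


def evaluate(expression):
    """Compute the answer for a decoded 'digit operator digit' string."""
    left, op, right = expression
    if not (left.isdigit() and right.isdigit() and op in OPERATIONS):
        raise ValueError("not a digit-operator-digit expression: %r"
                         % expression)
    return OPERATIONS[op](int(left), int(right))


# Hypothetical network output: 39 scores with peaks at '7', '+' and '5'.
scores = [0.0] * 39
scores[7] = 0.9        # position 0, symbol '7'
scores[13 + 10] = 0.8  # position 1, symbol '+'
scores[26 + 5] = 0.7   # position 2, symbol '5'
decoded = decode_expression(scores)
answer = evaluate(decoded)
```

The decoding step is the same per-group argmax used for letters; only the symbol set and the number of positions change.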

II. Some practical engineering issues that need to be addressed

(1) Synthetic training data

Training presupposes that we already have a lot of training data, but for verification code recognition it is hard to obtain enough labeled real data. So we have to synthesize the training data ourselves. This can usually be done by calling a text rendering library in Java or C#.

More training data is not always better. The main problem is that there is some gap in appearance between the synthetic data and real verification codes, and it is difficult for us to synthesize identical results. The font, font size, and degree of distortion all differ more or less from the real data, and this difference can directly cause the trained network to fail on real data.

Our experience is that when training data closely resembling the real data cannot be synthesized, the diversity of the samples should be increased, which is in fact the idea of data augmentation in deep learning. In Figure V, the left side is the real data. In the synthetic data we deliberately added rotation and translation of each character as well as extra noise, so that the trained network can cope with enough variation in the data to recognize real examples like the image on the left. Otherwise, even with high accuracy on the synthetic data, accuracy on real data may still be very low (i.e. over-fitting to the synthetic data).
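The three augmentations mentioned here (rotation, translation, added noise) can be sketched as follows. To stay self-contained, the example treats an image as a plain list of rows of binary pixels (0 or 1). A real pipeline would more likely use an image library. All function names and default ranges are assumptions for illustration, not the settings used by the authors.

```python
import math
import random


def translate(image, dx, dy, fill=0):
    """Shift a 2D grid of pixels by (dx, dy), padding with `fill`."""
    height, width = len(image), len(image[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_x, src_y = x - dx, y - dy
            if 0 <= src_x < width and 0 <= src_y < height:
                out[y][x] = image[src_y][src_x]
    return out


def rotate(image, degrees, fill=0):
    """Rotate around the image centre using nearest-neighbour sampling."""
    height, width = len(image), len(image[0])
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    angle = math.radians(degrees)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Map each output pixel back to its source position.
            src_x = int(round(cos_a * (x - cx) + sin_a * (y - cy) + cx))
            src_y = int(round(-sin_a * (x - cx) + cos_a * (y - cy) + cy))
            if 0 <= src_x < width and 0 <= src_y < height:
                out[y][x] = image[src_y][src_x]
    return out


def add_noise(image, probability, rng):
    """Flip each binary pixel (0 or 1) with the given probability."""
    return [[1 - p if rng.random() < probability else p for p in row]
            for row in image]


def augment(image, rng, max_shift=2, max_degrees=15, noise=0.02):
    """Apply a random rotation, translation and noise to one sample."""
    image = rotate(image, rng.uniform(-max_degrees, max_degrees))
    image = translate(image, rng.randint(-max_shift, max_shift),
                      rng.randint(-max_shift, max_shift))
    return add_noise(image, noise, rng)
```

Calling augment repeatedly on the same rendered sample with a seeded random.Random instance yields a reproducible stream of varied training images.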

Figure V. Synthetic data instances

(2) Selection of network size

The size of the network also has an enormous impact on the results, and the right size differs from task to task. Not all tasks have to be trained with a deep network. In theory, the deeper the network, the more degrees of freedom it has and the more easily it over-fits. Parameters such as weight_decay can counter over-fitting to a certain degree, but it usually remains difficult. So in general, a smaller network should be chosen for less complex verification codes; only for more complex verification codes, such as Chinese idioms, has a more complex network worked better in our experience.

In short, verification code recognition makes a good practice project for deep learning, and many concepts from deep learning theory are easier to understand through such a practical project.
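To make the role of a weight_decay parameter concrete, here is a minimal sketch of one gradient-descent update with an L2 penalty. The function name and the numbers are hypothetical. Frameworks implement this internally and may differ in detail.

```python
def sgd_step(weights, gradients, learning_rate, weight_decay):
    """One gradient step with an L2 penalty that shrinks every weight."""
    return [w - learning_rate * (g + weight_decay * w)
            for w, g in zip(weights, gradients)]


weights = [1.0, -2.0]
for _ in range(100):
    # Zero data gradient: only the decay term acts on the weights.
    weights = sgd_step(weights, [0.0, 0.0], learning_rate=0.1,
                       weight_decay=0.5)
```

With a zero data gradient, each step multiplies every weight by 1 - 0.1 * 0.5 = 0.95, so the weights drift toward zero. That pull toward small weights is what limits how freely a large network can fit noise in the training set.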

Reproduced in: http://www.saluzi.com/t/topic/16027
