A Shallow Understanding of Deep Learning

Source: Internet
Author: User

The recent deep learning (DL) boom has not only drawn the attention of academia but is also being eagerly pursued in industry. On many important benchmarks, DL has achieved state-of-the-art results. In speech recognition in particular, DL has reduced the error rate by about 30%, which is significant progress. A speech recognition company that does not use DL would nowadays be almost too embarrassed to say hello, and this situation is expected to extend to other fields such as image processing and natural language processing.

Deep learning is a branch of machine learning and can be understood as a development of neural networks. About 20 to 30 years ago, neural networks were a particularly popular direction in the ML field, but they gradually faded out for the following reasons:
1. They overfit easily, and their parameters are hard to tune;
2. Training is relatively slow, and with few layers (three or fewer) the results are no better than those of other methods.
As a result, for roughly 20 years in between, neural networks received little attention, and the field was basically dominated by SVMs and boosting algorithms. However, the devoted veteran Hinton persisted and eventually (together with others such as Bengio and Yann LeCun) developed a practical deep learning framework.

Deep learning has similarities to traditional neural networks, but also many differences.
The two are alike in that deep learning uses a layered structure similar to that of a neural network. The system is a multi-layer network consisting of an input layer, hidden layers (more than one), and an output layer. Only nodes in adjacent layers are connected; nodes within the same layer, or in non-adjacent layers, are not connected to each other. Each layer can be regarded as a logistic regression model. This layered structure is relatively close to the structure of the human brain.
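
To make the layered structure concrete, here is a minimal sketch of a forward pass in which every unit is a logistic regression over the previous layer's outputs. The layer sizes, the weight range, and all function names are illustrative assumptions and do not come from the original article.

```python
import math
import random

def sigmoid(z):
    """Logistic function used by every unit."""
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def layer_forward(x, weights, biases):
    """Each output unit is a logistic regression over the previous layer's outputs."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def network_forward(x, layers):
    """Pass the input through adjacent layers only (no skip or intra-layer links)."""
    activation = x
    for weights, biases in layers:
        activation = layer_forward(activation, weights, biases)
    return activation

rng = random.Random(0)
sizes = [4, 5, 3, 1]  # input, two hidden layers, output (hypothetical sizes)
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
output = network_forward([0.2, 0.7, 0.1, 0.9], layers)
print(output)
```

Because each layer only reads the activations of the layer directly before it, the connectivity rule described above is enforced by the loop structure itself.
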
To overcome the problems in neural network training, DL adopts a very different training mechanism. Traditional neural networks are trained with back propagation. In short, an iterative algorithm trains the entire network: the initial values are set randomly, the output of the current network is computed, and the parameters of the preceding layers are then adjusted according to the difference between the current output and the label, until convergence (overall, this is a gradient descent method). Deep learning instead uses a layer-wise training mechanism. The reason is that with back propagation in a deep network (more than 7 layers), the residual propagated back to the front layers becomes too small, a problem known as gradient diffusion (also called the vanishing gradient problem).
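
The shrinking residual can be illustrated with a deliberately simplified sketch: a chain with one sigmoid unit per layer and a fixed weight of 1.0. Both simplifications, and the function name, are assumptions made for illustration only. Since the derivative of the sigmoid is at most 0.25, the back-propagated signal in this setting shrinks by at least a factor of four per layer.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def backprop_signal_per_layer(depth, weight=1.0, x=0.5):
    """Return |d output / d pre-activation| for each layer of a one-unit-wide chain."""
    # Forward pass: store the activation of every layer.
    activations = []
    a = x
    for _ in range(depth):
        a = sigmoid(weight * a)
        activations.append(a)
    # Backward pass: apply the chain rule from the output layer to the first layer.
    grads = [0.0] * depth
    upstream = 1.0  # d output / d output
    for layer in reversed(range(depth)):
        a = activations[layer]
        local = a * (1.0 - a)  # sigmoid'(z) = a * (1 - a), never larger than 0.25
        grads[layer] = abs(upstream * local)
        upstream = upstream * local * weight  # signal handed to the previous layer
    return grads

grads = backprop_signal_per_layer(depth=8)
for index, g in enumerate(grads, start=1):
    print(f"layer {index}: {g:.2e}")
```

In the printed list, layer 1 is the layer closest to the input, and it receives the smallest signal. Real networks have many units per layer and learned weights, so the decay is less regular, but the multiplicative effect is the same.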

The deep learning training process is as follows:
1. Use unlabeled data (labeled data also works) to train the parameters of each layer, one layer at a time. This step can be regarded as an unsupervised training process and is the biggest difference from traditional neural networks (it can also be viewed as a feature learning process):
Specifically, first train the first layer with unlabeled data. An auto-encoder can be used to learn the first layer's parameters (this layer can be seen as the hidden layer of a network trained to minimize the difference between its output and its input). Because of the capacity constraint and the sparsity constraint, the resulting model learns the structure of the data itself and thus obtains features that are more expressive than the raw input. After layer n-1 has been learned, its output is used as the input of layer n to train layer n, and the parameters of each layer are obtained in turn;
The key concepts to understand here are the auto-encoder and sparsity.
2. Starting from the parameters obtained in step 1, further fine-tune the parameters of the entire multi-layer model. This step is a supervised training process. Step 1 plays the role of the random initialization in a traditional neural network, but because the initial values in DL are not random and are instead obtained by learning the structure of the input data, they are closer to the global optimum and lead to better results. The effectiveness of deep learning is therefore largely attributable to the feature learning process in step 1.
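
Step 1 above can be sketched roughly as follows. This is a minimal illustration under several assumptions that are not in the original article: sigmoid units, a squared reconstruction error, plain stochastic gradient descent, random binary toy data, and hypothetical layer sizes and function names. It relies only on the capacity constraint (fewer hidden units than inputs); the sparsity penalty and the supervised fine-tuning of step 2 are left out.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine_sigmoid(x, W, b):
    """Sigmoid of W x + b for one layer."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bi) for row, bi in zip(W, b)]

def reconstruction_error(data, W1, b1, W2, b2):
    """Mean squared difference between each input and its reconstruction."""
    total = 0.0
    for x in data:
        x_hat = affine_sigmoid(affine_sigmoid(x, W1, b1), W2, b2)
        total += sum((xh - xi) ** 2 for xh, xi in zip(x_hat, x))
    return total / len(data)

def train_autoencoder(data, n_hidden, rng, epochs=200, lr=0.5):
    """Train one auto-encoder by SGD; return encoder parameters and before/after error."""
    n_in = len(data[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    b1, b2 = [0.0] * n_hidden, [0.0] * n_in
    err_before = reconstruction_error(data, W1, b1, W2, b2)
    for _ in range(epochs):
        for x in data:
            h = affine_sigmoid(x, W1, b1)
            x_hat = affine_sigmoid(h, W2, b2)
            # Output error signal: derivative of 0.5 * (x_hat - x)^2 through the sigmoid.
            d2 = [(xh - xi) * xh * (1 - xh) for xh, xi in zip(x_hat, x)]
            # Hidden error signal, computed before W2 is modified.
            d1 = [sum(W2[i][j] * d2[i] for i in range(n_in)) * h[j] * (1 - h[j])
                  for j in range(n_hidden)]
            for i in range(n_in):
                for j in range(n_hidden):
                    W2[i][j] -= lr * d2[i] * h[j]
                b2[i] -= lr * d2[i]
            for j in range(n_hidden):
                for k in range(n_in):
                    W1[j][k] -= lr * d1[j] * x[k]
                b1[j] -= lr * d1[j]
    err_after = reconstruction_error(data, W1, b1, W2, b2)
    return (W1, b1), err_before, err_after

def pretrain_stack(data, hidden_sizes, rng):
    """Greedy layer-wise pretraining: layer n is trained on the output of layer n-1."""
    encoders, errors, layer_input = [], [], data
    for n_hidden in hidden_sizes:
        (W, b), before, after = train_autoencoder(layer_input, n_hidden, rng)
        encoders.append((W, b))
        errors.append((before, after))
        layer_input = [affine_sigmoid(x, W, b) for x in layer_input]
    return encoders, errors

rng = random.Random(0)
# Toy unlabeled data: 8-dimensional binary patterns (hypothetical example data).
data = [[float(rng.random() < 0.5) for _ in range(8)] for _ in range(20)]
encoders, errors = pretrain_stack(data, hidden_sizes=[5, 3], rng=rng)
for depth, (before, after) in enumerate(errors, start=1):
    print(f"layer {depth}: reconstruction error {before:.3f} -> {after:.3f}")
```

The returned `encoders` list holds one weight matrix and bias vector per layer. In the scheme described above, these would serve as the initial values for the supervised fine-tuning of step 2 in place of random initialization.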

In short, deep learning can represent the features of data better. Because the model has many layers and many parameters, its capacity is sufficient to represent large-scale data. For problems such as images and speech, where the features are not obvious (they must be designed by hand, and many have no intuitive physical meaning), it can therefore achieve better results given large-scale training data. In addition, from the pattern recognition perspective of features and classifiers, the deep learning framework combines the feature and the classifier into a single framework and learns the features from data, which reduces the heavy workload of designing features by hand (currently the part on which engineers in industry spend the most effort). It therefore not only gives better results but is also much more convenient to use, which makes it a framework well worth attention. Everyone working on ML should pay attention to it.

Of course, deep learning itself is not perfect, nor is it a universal tool for solving every ML problem in the world. It should not be inflated to the level of omnipotence.

A deeper understanding of deep learning is therefore still needed before it can be put to good use.

Recommended learning materials:
1. Stanford deep learning tutorial: http://ufldl.stanford.edu/wiki/index.php/ufldl_tutorial
2. Survey: Bengio's "Learning Deep Architectures for AI": http://www.iro.umontreal.ca/~bengioy/papers/ftml_book.pdf
3. Andrew Ng's talk video: http://techtalks.tv/talks/machine-learning-and-ai-via-brain-simulations/57862/
4. CVPR 2012 tutorial: http://cs.nyu.edu/~fergus/tutorials/deep_learning_cvpr12/tutorial_p2_nnets_ranzato_short.pdf
An introduction that goes from LR and neural networks through convolutional neural networks to sparse encoders.

 

Original article: http://blog.sina.com.cn/s/blog_6ae183910101dw2z.html
