Basic ideas and methods of deep learning

Deep learning, also known as unsupervised feature learning or feature learning, is currently a hot research topic.

This article mainly introduces the basic idea and common methods of deep learning.

I. What is deep learning?

In real life, to solve a problem such as classifying objects (documents or images), we must first represent each object, that is, extract features that describe it. In text processing, for example, a document is often represented by its set of words, or as a vector in a vector space (the vector space model, or VSM); different classification algorithms can then be applied to that representation. In image processing, an image can be represented by its set of pixels; later, better feature representations such as SIFT were proposed, and SIFT works very well in many image processing applications. The choice of features has a huge impact on the final result, so selecting good features is crucial when solving a practical problem.
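To make the word-set representation concrete, here is a minimal sketch (my illustration, not part of the original article) that turns two toy documents into bag-of-words count vectors, the kind of representation the VSM uses:

```python
# Minimal bag-of-words / VSM sketch (illustrative, not from the article):
# each document becomes a vector of word counts over a fixed vocabulary.
import numpy as np

docs = ["deep learning learns features",
        "features are important for classification"]

# Build the vocabulary from the toy corpus.
vocab = sorted({w for d in docs for w in d.split()})

def bag_of_words(doc, vocab):
    """Represent a document as a count vector over the vocabulary."""
    counts = np.zeros(len(vocab))
    for word in doc.split():
        if word in vocab:
            counts[vocab.index(word)] += 1
    return counts

vectors = np.stack([bag_of_words(d, vocab) for d in docs])
print(vocab)
print(vectors)   # one row per document, one column per word
```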

However, manual feature selection is laborious and heuristic; it depends to a large extent on experience and luck. Since manual feature selection is so difficult, can we learn features automatically instead? The answer is yes, and this is exactly what deep learning does. Its alias, unsupervised feature learning, suggests the idea: "unsupervised" means that no human is involved in the feature selection process. In other words, deep learning refers to methods that learn features automatically.

II. The basic idea of deep learning

Suppose we have a system S with n layers (S1, ..., Sn), input I, and output O, which we can picture as I => S1 => S2 => ... => Sn => O. If the output O equals the input I, that is, the input passes through the system unchanged, then no information is lost at any layer Si; each layer Si is therefore just another representation of the original information (the input I). Now back to our topic, deep learning: we want to learn features automatically. Given a set of inputs I (for example, a collection of images or texts) and a designed system S with n layers, if we adjust the parameters of the system so that its output is still I, then we automatically obtain a series of hierarchical features of the input I, namely S1, ..., Sn.

In addition, the assumption that the output is strictly equal to the input is too strict. We can relax it slightly and only require the difference between the input and the output to be as small as possible; this relaxation leads to another class of deep learning methods. The above is the basic idea of deep learning.
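As a rough illustration of this idea (my own sketch, not from the original text), the following snippet passes a toy input I through two untrained layers and measures how far the output O is from I; training such a system would adjust the layer weights to shrink this reconstruction error.

```python
# Layered-system view: I => S1 => S2 => O, with a reconstruction error to minimize.
import numpy as np

rng = np.random.default_rng(0)
I = rng.random(8)                       # a toy input vector

# Two random layers standing in for S1 and S2 (untrained, so O will differ from I).
W1 = rng.normal(size=(6, 8))
W2 = rng.normal(size=(8, 6))

S1 = np.tanh(W1 @ I)                    # first hierarchical representation of I
O = W2 @ S1                             # output of the system

reconstruction_error = np.linalg.norm(I - O)
print(reconstruction_error)             # training would adjust W1, W2 to minimize this
```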

III. Common deep learning methods

A) Autoencoder

One of the simplest approaches uses the properties of artificial neural networks. An artificial neural network (ANN) is itself a hierarchical system. Given a neural network, assume that its output should equal its input, then train it and adjust its parameters to obtain the weights of each layer. We naturally obtain several different representations of the input I (each layer gives one representation), and these representations are features. Studies have found that adding these learned features to the original features can greatly improve accuracy, and on some classification problems the result even beats the best existing classification algorithms. This method is called an autoencoder. Of course, we can add constraints to obtain new deep learning methods: for example, adding an L1 regularity constraint on top of the autoencoder (L1 mainly forces most nodes in each layer to be 0, with only a few non-zero, which is where the name "sparse" comes from) yields the sparse autoencoder method.
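Below is a minimal NumPy sketch of this idea, written for illustration rather than taken from the article: a one-hidden-layer autoencoder trained by gradient descent to reproduce its input, whose hidden activations serve as the learned features. Setting the `sparsity` parameter above 0 adds the L1 penalty on hidden units mentioned above (the sparse autoencoder variant).

```python
# One-hidden-layer autoencoder: encode with a sigmoid layer, decode linearly,
# and minimize the reconstruction error 0.5 * ||X - X_hat||^2.
import numpy as np

def train_autoencoder(X, n_hidden=16, lr=0.1, epochs=500, sparsity=0.0, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(scale=0.1, size=(d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=(n_hidden, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        H = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))       # encoder: learned features
        X_hat = H @ W2 + b2                            # decoder: reconstruction
        # Gradient of the reconstruction loss, averaged over the batch.
        g_out = (X_hat - X) / n
        g_W2, g_b2 = H.T @ g_out, g_out.sum(0)
        g_H = g_out @ W2.T + sparsity * np.sign(H) / n # L1 term pushes features to 0
        g_pre = g_H * H * (1 - H)                      # sigmoid derivative
        g_W1, g_b1 = X.T @ g_pre, g_pre.sum(0)
        W1 -= lr * g_W1; b1 -= lr * g_b1
        W2 -= lr * g_W2; b2 -= lr * g_b2
    return W1, b1, W2, b2

# Usage: learn features for toy data, then encode it.
X = np.random.default_rng(1).random((100, 8))
W1, b1, W2, b2 = train_autoencoder(X)
features = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))        # learned representation of X
```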

B) Sparse Coding

If we relax the restriction that the output must equal the input and use a basic idea from linear algebra, namely O = w1 * b1 + w2 * b2 + ... + wn * bn, where the bi are basis vectors and the wi are coefficients, we obtain the following optimization problem:

min |I - O|

By solving this optimization problem we obtain the coefficients wi and the bases bi. Together they form another, approximate representation of the input I, so they can be used to express I, and this process is learned automatically. If we add an L1 regularity constraint to the formula above, we obtain:

min |I - O| + u * (|w1| + |w2| + ... + |wn|)

This method is called Sparse Coding.
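The following is a hedged sketch of how the coefficients wi might be found in practice (my illustration, not code from the article): with a fixed random dictionary of bases B, the L1-regularized fit is solved by iterative soft-thresholding (ISTA). The fit term here uses the common squared-error form of the objective, and full sparse coding also learns the dictionary B, typically by alternating this coefficient step with a dictionary update.

```python
# Sparse coding of the coefficients w for a fixed dictionary B (ISTA, illustrative).
import numpy as np

def sparse_code(x, B, u=0.1, n_iters=200):
    """Find sparse coefficients w so that B @ w approximates x."""
    step = 1.0 / np.linalg.norm(B, 2) ** 2          # safe gradient step size
    w = np.zeros(B.shape[1])
    for _ in range(n_iters):
        grad = B.T @ (B @ w - x)                    # gradient of the fit term
        w = w - step * grad
        w = np.sign(w) * np.maximum(np.abs(w) - step * u, 0.0)  # L1 shrinkage
    return w

rng = np.random.default_rng(0)
B = rng.normal(size=(16, 64))                       # 64 basis vectors b_i (columns)
x = rng.normal(size=16)                             # an input I
w = sparse_code(x, B)
print(np.count_nonzero(w), "of", w.size, "coefficients are non-zero")
```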

C) Restricted Boltzmann Machine (RBM)

Assume there is a bipartite graph with no links between nodes within each layer. One layer is the visible layer, i.e. the input data layer (v), and the other is the hidden layer (h). If every node is a binary variable (it can only take the value 0 or 1) and the joint probability distribution p(v, h) satisfies the Boltzmann distribution, we call this model a Restricted Boltzmann Machine (RBM). Let us see why it is a deep learning method. First, because the graph is bipartite, all hidden nodes are conditionally independent given the visible layer v, that is, p(h|v) = p(h1|v)...p(hn|v). Similarly, all visible nodes are conditionally independent given the hidden layer h. Since v and h satisfy the Boltzmann distribution, when an input v is given we can obtain the hidden layer h through p(h|v), and from the hidden layer h we can in turn obtain the visible layer through p(v|h). By adjusting the parameters, we want the visible layer v1 obtained from the hidden layer to be the same as the original visible layer v; if so, the hidden layer is another representation of the visible layer, and it can therefore be used as features of the visible-layer input data. This is why it is a deep learning method.
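The reconstruction loop described above (sample h from p(h|v), reconstruct v1 from p(v|h), and adjust the parameters so v1 resembles v) is essentially what the one-step contrastive divergence (CD-1) training rule does. The sketch below is my own illustration of CD-1 for a binary RBM, not code given in the article.

```python
# Training a binary RBM with one-step contrastive divergence (CD-1), illustrative.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(V, n_hidden=32, lr=0.05, epochs=200, seed=0):
    """V: binary data matrix (n_samples, n_visible). Returns weights and biases."""
    rng = np.random.default_rng(seed)
    n, n_vis = V.shape
    W = rng.normal(scale=0.01, size=(n_vis, n_hidden))
    a, b = np.zeros(n_vis), np.zeros(n_hidden)
    for _ in range(epochs):
        # Positive phase: p(h = 1 | v) for the data.
        ph = sigmoid(V @ W + b)
        h = (rng.random(ph.shape) < ph).astype(float)   # sample hidden states
        # Negative phase: reconstruct the visible layer v1, then its hidden probs.
        pv1 = sigmoid(h @ W.T + a)                       # p(v = 1 | h)
        ph1 = sigmoid(pv1 @ W + b)
        # CD-1 update: push the model toward making v1 look like v.
        W += lr * (V.T @ ph - pv1.T @ ph1) / n
        a += lr * (V - pv1).mean(axis=0)
        b += lr * (ph - ph1).mean(axis=0)
    return W, a, b

# Usage: the hidden probabilities p(h|v) serve as learned features for V.
V = (np.random.default_rng(1).random((200, 16)) > 0.5).astype(float)
W, a, b = train_rbm(V)
features = sigmoid(V @ W + b)
```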

If we increase the number of hidden layers, we obtain a Deep Boltzmann Machine (DBM). If instead we use a Bayesian belief network (a directed graphical model, still with no links between nodes within a layer) in the part close to the visible layer and Restricted Boltzmann Machines in the part far from the visible layer, we obtain a Deep Belief Net (DBN).
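As a short, hypothetical continuation of the RBM sketch above (it reuses the `train_rbm` and `sigmoid` helpers and the data matrix V defined there), greedy layer-wise stacking trains each new RBM on the hidden-layer features produced by the previous one:

```python
# Greedy layer-wise stacking of RBMs (illustrative; builds on the RBM sketch above).
layer_sizes = [32, 16]                 # hidden sizes for two stacked RBMs
inputs, stack = V, []
for size in layer_sizes:
    W, a, b = train_rbm(inputs, n_hidden=size)
    stack.append((W, b))
    inputs = sigmoid(inputs @ W + b)   # features feeding the next layer
# `inputs` now holds the top-layer representation of the original data V.
```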

Of course, there are other deep learning methods that will not be described here. In short, deep learning automatically learns another representation of the data. This representation can be used as features and added to the feature set of the original problem, improving the effectiveness of the learning method, and it is currently a hot topic in the industry.
