Machine learning--DBN (Deep Belief Network) explained

Source: Internet
Author: User
Tags: svm

Deep neural networks have achieved unprecedented success in fields such as speech recognition and image recognition. I was first exposed to neural networks many years ago; this series of articles records some of my experiences in learning about deep neural networks.

This article gives a brief description of the deep neural network model.

1. Self-associative neural networks and deep networks

The self-associative neural network is a very old neural network model. Simply put, it is a three-layer BP network, except that its output is trained to equal its input. Much of the time we do not require the output to be exactly equal to the input, but allow a certain amount of error to exist; so we say that the output is a reconstruction of the input. Its network structure is very simple: an input layer, a hidden middle layer, and an output layer of the same size as the input.

If we replace the sigmoid function in the above network with a linear function, this becomes the PCA model: the number of intermediate network nodes is the number of principal components. Do not worry that the learning algorithm will converge to a local optimum, because the linear BP network has a unique minimum.
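To make the PCA connection concrete, here is a minimal sketch (numpy only; the toy data, learning rate, and iteration count are all illustrative assumptions) of a linear autoencoder trained by gradient descent. The subspace spanned by its code layer should match the top principal subspace found by SVD.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 500 points in 5-D that lie near a 2-D plane (plus small noise).
X = rng.normal(size=(500, 2)) @ rng.normal(size=(2, 5))
X += 0.01 * rng.normal(size=X.shape)
X -= X.mean(axis=0)                      # center the data, as PCA does

# Linear autoencoder: encode to k dims, decode back, squared-error loss.
k, lr = 2, 0.01
W_enc = rng.normal(0, 0.1, (5, k))
W_dec = rng.normal(0, 0.1, (k, 5))
for step in range(5000):
    H = X @ W_enc                        # linear middle (code) layer
    R = H @ W_dec                        # linear reconstruction of the input
    G = 2.0 * (R - X) / len(X)           # gradient of mean squared error
    gH = G @ W_dec.T
    W_dec -= lr * (H.T @ G)
    W_enc -= lr * (X.T @ gH)

# Compare the learned code subspace with the top-k principal subspace.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
P_pca = Vt[:k].T @ Vt[:k]                # projector onto the PCA subspace
Q, _ = np.linalg.qr(W_enc)
P_ae = Q @ Q.T                           # projector onto the learned subspace
print("subspace gap:", np.linalg.norm(P_pca - P_ae))   # should be close to 0
```

The equivalence holds only up to a rotation of the code space, which is why the comparison uses projectors onto the two subspaces rather than the weight vectors themselves.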

In deep learning terminology, the structure above is called an autoencoder (self-encoding) neural network. From a historical point of view, the autoencoder dates back decades; there is nothing novel about it.

Since the self-associative network can reconstruct the input data, once the network has been trained, its middle layer can be seen as a kind of feature representation of the raw input data. If we remove the third layer, we are left with a two-layer network. If we then take this learned feature and build a second three-layer self-associative BP network on top of it in the same way, the input of this second self-associative network is the output of the middle layer of the previous network. We train the second self-associative network with the same algorithm, and the middle layer of the second network is then a feature representation of its input. Following this approach, we can create many such structures; stacked together, these self-associative networks form a deep neural network.
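As a concrete illustration, here is a minimal sketch of this greedy layer-by-layer procedure (numpy only; the random stand-in data, layer sizes, and training constants are all assumptions). Each three-layer autoencoder is trained on the middle-layer codes of the previous one, its decoder (the third layer) is discarded, and the kept encoders stack into a deep feature extractor.

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, lr=0.1, epochs=500):
    """Train one 3-layer autoencoder on X; return (encoder params, codes)."""
    n_in = X.shape[1]
    W1 = rng.normal(0, 0.1, (n_in, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.1, (n_hidden, n_in)); b2 = np.zeros(n_in)
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)         # middle layer = feature representation
        R = H @ W2 + b2                  # linear reconstruction of the input
        G = (R - X) / len(X)             # squared-error gradient
        gH = (G @ W2.T) * H * (1 - H)    # back-propagate through the sigmoid
        W2 -= lr * H.T @ G;  b2 -= lr * G.sum(0)
        W1 -= lr * X.T @ gH; b1 -= lr * gH.sum(0)
    return (W1, b1), sigmoid(X @ W1 + b1)

# Greedy stacking: each autoencoder is trained on the previous one's codes.
X = rng.random((1000, 64))               # stand-in for raw input data
encoders, codes = [], X
for n_hidden in (32, 16, 8):             # layer sizes are illustrative
    enc, codes = train_autoencoder(codes, n_hidden)
    encoders.append(enc)                 # keep encoder, discard the third layer
print("final feature shape:", codes.shape)   # (1000, 8)
```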

Note that the final layer of the resulting deep network is a softmax classifier cascaded on top of the stacked features.

Each level of the deep neural network describes the raw input data at a different conceptual granularity, that is, as a different level of features.

This method of cascading multiple self-associative networks was first conceived by Hinton.

From the above description, it can be seen that the deep network is trained layer by layer, and the classifier in the last layer is also trained separately. That final classifier can be replaced by any kind of classifier, such as an SVM or an HMM. Each of the layers above is trained separately using the BP algorithm; Hinton has already experimented with this idea.

2. The DBN neural network model

When training each layer individually with the BP algorithm, we found that the third layer of each network must be discarded in order to cascade the self-associative networks. However, there is a better neural network model for this purpose: the restricted Boltzmann machine (RBM). In deep learning, the method of cascading RBMs to form a deep neural network is called the deep belief network (DBN), and it is a very popular method at present. In the terminology used below, the self-associative network is called an autoencoder, and a deep network built by cascading autoencoders is called a stacked autoencoder.

The classic DBN structure is a deep neural network composed of several layers of RBMs and one layer of BP.
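For background, here is a minimal sketch of training a single binary RBM with one step of contrastive divergence (CD-1), the usual learning rule for the RBM layers of a DBN (numpy only; the binary stand-in data, layer sizes, learning rate, and epoch count are assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Binary RBM: visible units v, hidden units h, energy -v'Wh - a'v - b'h.
n_vis, n_hid, lr = 64, 32, 0.1
W = rng.normal(0, 0.01, (n_vis, n_hid))
a = np.zeros(n_vis)                       # visible biases
b = np.zeros(n_hid)                       # hidden biases

X = (rng.random((1000, n_vis)) > 0.5).astype(float)   # stand-in binary data

for epoch in range(20):
    for i in range(0, len(X), 100):       # mini-batches of 100
        v0 = X[i:i+100]
        # Positive phase: hidden probabilities given the data.
        ph0 = sigmoid(v0 @ W + b)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        # Negative phase (CD-1): one Gibbs step down to visible and back up.
        pv1 = sigmoid(h0 @ W.T + a)
        ph1 = sigmoid(pv1 @ W + b)
        # Contrastive divergence update: <v h>_data - <v h>_model.
        W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(v0)
        a += lr * (v0 - pv1).mean(0)
        b += lr * (ph0 - ph1).mean(0)
```

CD-1 approximates the gradient of the data log-likelihood by contrasting statistics from the data with statistics after a single Gibbs step; Hinton's practical guide [6] discusses the many tuning details omitted here.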

DBN training is mainly divided into two steps (sketched in code after the list):
Step 1: Train each layer's RBM network separately and without supervision, ensuring that as much feature information as possible is retained when the feature vectors are mapped into each layer's feature space;
Step 2: Set up a BP network as the last layer of the DBN, take the output feature vectors of the final RBM as its input feature vectors, and train this classifier with supervision. Each RBM layer can only guarantee that the weights within that layer are optimal for that layer's own feature-vector mapping, not for the feature mapping of the whole DBN, so the back-propagation network also propagates the error information from the top down to every RBM layer, fine-tuning the entire DBN. The RBM training process can thus be regarded as the initialization of the weight parameters of a deep BP network; this is how the DBN overcomes the weaknesses that a BP network with randomly initialized weights has, namely a tendency to fall into local optima and long training times.
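Here is a minimal end-to-end sketch of both steps (numpy only; the labelled toy data, layer sizes, and training constants are all assumptions, and each RBM is trained full-batch with CD-1 for brevity):

```python
import numpy as np

rng = np.random.default_rng(3)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def train_rbm_cd1(X, n_hid, lr=0.1, epochs=30):
    """Step 1 helper: unsupervised CD-1 training of one RBM layer."""
    W = rng.normal(0, 0.01, (X.shape[1], n_hid))
    a, b = np.zeros(X.shape[1]), np.zeros(n_hid)
    for _ in range(epochs):
        ph0 = sigmoid(X @ W + b)                       # positive phase
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        pv1 = sigmoid(h0 @ W.T + a)                    # one Gibbs step down...
        ph1 = sigmoid(pv1 @ W + b)                     # ...and back up
        W += lr * (X.T @ ph0 - pv1.T @ ph1) / len(X)
        a += lr * (X - pv1).mean(0)
        b += lr * (ph0 - ph1).mean(0)
    return W, b

# Labelled toy data: 500 binary 64-D vectors, 2 classes.
X = (rng.random((500, 64)) > 0.5).astype(float)
y = (X[:, :32].sum(1) > X[:, 32:].sum(1)).astype(int)
Y = np.eye(2)[y]                                       # one-hot labels

# ---- Step 1: greedy unsupervised pre-training of the RBM stack ----
Ws, bs, H = [], [], X
for n_hid in (32, 16):
    W, b = train_rbm_cd1(H, n_hid)
    Ws.append(W); bs.append(b)
    H = sigmoid(H @ W + b)             # this layer's features feed the next RBM

# ---- Step 2: supervised fine-tuning with BP (softmax classifier on top) ----
Wout, bout, lr = rng.normal(0, 0.1, (16, 2)), np.zeros(2), 0.5
for epoch in range(300):
    acts = [X]                                         # forward pass
    for W, b in zip(Ws, bs):
        acts.append(sigmoid(acts[-1] @ W + b))
    logits = acts[-1] @ Wout + bout
    P = np.exp(logits - logits.max(1, keepdims=True))
    P /= P.sum(1, keepdims=True)
    # Backward pass: error flows down into every pre-trained RBM layer.
    delta = (P - Y) / len(X)                           # softmax cross-entropy
    gWout, gbout = acts[-1].T @ delta, delta.sum(0)
    delta = (delta @ Wout.T) * acts[-1] * (1 - acts[-1])
    for i in range(len(Ws) - 1, -1, -1):
        gW, gb = acts[i].T @ delta, delta.sum(0)
        if i > 0:
            delta = (delta @ Ws[i].T) * acts[i] * (1 - acts[i])
        Ws[i] -= lr * gW; bs[i] -= lr * gb
    Wout -= lr * gWout; bout -= lr * gbout

A = X                                                  # final training accuracy
for W, b in zip(Ws, bs):
    A = sigmoid(A @ W + b)
print("training accuracy:", ((A @ Wout + bout).argmax(1) == y).mean())
```

Consistent with the remark below, the softmax layer here could be swapped for any classifier (an SVM, for instance) trained on the pre-trained features.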

In deep learning, the first step of the above training procedure is called pre-training, and the second step is called fine-tuning. The supervised layer at the top can be replaced, depending on the specific application domain, by any classifier model; it need not be a BP network.

3. Applications of the deep belief network

Since the self-encoding network can abstract the raw data at different conceptual granularities, one natural application of the deep network is compressing the data, also called dimensionality reduction.

Hu Shaohua et al. used a self-encoding network to reconstruct the classic "Swiss roll" data:

"Swiss Roll" data is one of the most difficult data to classify in the classical machine learning, and its implicit data intrinsic pattern is difficult to describe in the two-dimensional data. However, Hu Shaohua and so on, using depth belief network to realize the 2-dimensional representation of three-dimensional Swiss volume data, its self-coding network node size is 3-100-50-25-10-2. For specific implementation details, please refer to the literature: Hu Shaohua, Song Yaoliang: Data dimensionality reduction and reconstruction based on Autoencoder network.

Another common application of deep neural networks is feature extraction.

Literature: Philippe Hamel and Douglas Eck, "Learning Features from Music Audio with Deep Belief Networks."

By training a 5-layer deep network to extract features from music for musical-genre classification, the classification accuracy is about 14% higher than that of the method based on Mel cepstral coefficient (MFCC) features.

Their implementation is very simple: multiple RBM networks are cascaded, as above, into a deep network that extracts features from the music. The raw input data is the spectrum of the signal after framing and windowing. The classifier is a support vector machine (SVM). The baseline method extracts MFCC feature coefficients and also uses an SVM as the classifier. More details and experimental results can be found in the literature cited above.
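As a rough illustration of this pipeline's skeleton, here is a sketch of the front end (framing, windowing, magnitude spectrum) feeding an SVM classifier (numpy plus scikit-learn's SVC; the synthetic two-"genre" audio is made up, and the DBN feature-extraction stage between the spectra and the SVM is omitted here for brevity):

```python
import numpy as np
from sklearn.svm import SVC   # scikit-learn's support vector classifier

rng = np.random.default_rng(5)

def frame_spectra(signal, frame_len=512, hop=256):
    """Split a signal into overlapping frames, window them, take |FFT|."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i*hop : i*hop+frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))   # magnitude spectra per frame

# Two synthetic "genres": low-pitched vs high-pitched noisy tones.
def make_clip(freq, n=8192, sr=8000.0):
    t = np.arange(n) / sr
    return np.sin(2 * np.pi * freq * t) + 0.3 * rng.normal(size=n)

clips = [make_clip(rng.uniform(200, 400)) for _ in range(40)] + \
        [make_clip(rng.uniform(1200, 1400)) for _ in range(40)]
labels = np.array([0] * 40 + [1] * 40)

# Average each clip's frame spectra into one feature vector per clip.
feats = np.stack([frame_spectra(c).mean(axis=0) for c in clips])

# Train/test split and SVM classification.
idx = rng.permutation(len(feats))
tr, te = idx[:60], idx[60:]
clf = SVC(kernel="rbf").fit(feats[tr], labels[tr])
print("test accuracy:", clf.score(feats[te], labels[te]))
```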

The deep network is a good unsupervised learning method, and its feature-extraction ability can be applied at different conceptual granularities in many fields. Typically, the DBN is used mainly for modeling one-dimensional data, such as speech. Deep networks composed of cascaded multilayer convolutional networks are mainly used for two-dimensional data, such as images.

With the material above, you should be able to get a deeper understanding of the DBN (deep belief network) algorithm.

References:
[1] Hinton G E, Salakhutdinov R. Reducing the dimensionality of data with neural networks. Science, vol. 313, pp. 504-507, 2006.
[2] Hinton G E, Osindero S, Teh Y W. A fast learning algorithm for deep belief nets. Neural Computation, vol. 18, pp. 1527-1554, 2006.
[3] Xie Jipeng, et al. Learning features from high speed train vibration signals with deep belief networks. Neural Networks (IJCNN), 2014 International Joint Conference on. IEEE, 2014.
[4] Bengio Y, Lamblin P, Popovici D, et al. Greedy layer-wise training of deep networks. Advances in Neural Information Processing Systems, pp. 153-160, 2007.
[5] Salakhutdinov R. Learning deep generative models. Diss. University of Toronto, 2009.
[6] Hinton G. A practical guide to training restricted Boltzmann machines. Neural Networks: Tricks of the Trade, pp. 599-619, 2012.
[7] Bengio Y. Learning deep architectures for AI. Foundations and Trends in Machine Learning, vol. 2, pp. 1-127, 2009.
[8] http://blog.csdn.net/celerychen2009/article/details/9079715
