Paper List about Deep Learning

Source: Internet
Author: User
Tags: dsn, dnn

A list of papers in several areas of deep learning, compiled for personal use.

Part One: RNN

1 Recurrent Neural Network Based Language Model

Applies an RNN to language modeling.

2 Statistical Language Models Based on Neural Networks

Mikolov's doctoral dissertation, which ties together his work on RNN language models.

3 Extensions of Recurrent Neural Network Language Model

A continuation of the RNN work, with some improvements to the network, such as using class information to reduce the number of model parameters.

4 A Guide to Recurrent Neural Networks and Backpropagation

An introduction to RNNs and their training algorithms; a good article for understanding RNNs.

5 Training Recurrent Neural Networks

Ilya Sutskever's doctoral dissertation. Training RNNs has always been difficult; this work introduces optimization methods for training them.

6 Strategies for Training Large Scale Neural Network Language Models

Introduces some tricks for training RNN language models.

7 Recurrent Neural Networks for Language Understanding

Work on applying RNNs to language understanding.

8 Empirical Evaluation and Combination of Advanced Language Modeling Techniques

Reports experience with combining language models, including combining the RNN language model with other models.

9 Speech Recognition with Deep Recurrent Neural Networks

Work on applying RNNs to speech recognition.

10 A Neural Probabilistic Language Model

Not an RNN. Yoshua Bengio's early work on training language models with neural networks, which laid the foundation for the later RNN language models.

On the Difficulty of Training Recurrent Neural Networks

Discusses the difficulties of training RNNs, such as vanishing gradients, and some proposed solutions.
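
As a rough illustration of the gradient norm clipping remedy discussed there, here is a minimal NumPy sketch (the function name and threshold are made up for illustration, not taken from the paper):

```python
import numpy as np

def clip_gradient(grad, threshold=1.0):
    """Rescale a gradient if its L2 norm exceeds `threshold`.

    This mirrors the norm-clipping idea used to tame exploding gradients
    when backpropagating through time; vanishing gradients need other
    remedies (e.g. different architectures or regularization).
    """
    norm = np.linalg.norm(grad)
    if norm > threshold:
        grad = grad * (threshold / norm)
    return grad

# Toy usage: a gradient that has blown up gets rescaled to norm 1.0.
g = np.array([30.0, -40.0])              # norm = 50
print(clip_gradient(g, threshold=1.0))   # -> [ 0.6 -0.8 ]
```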

Subword Language Modeling with Neural Networks

Word-level language models cannot handle new words because of the OOV problem, while character-level language models overcome it at the cost of higher training complexity. To combine the strengths of both, the paper proposes training a subword-level RNN language model and compresses the model parameters with K-means.

Performance Analysis of Neural Networks in Combination with N-gram Language Models

Analyzes the performance of combining n-gram and neural network language models; the experiments show that the combination improves performance.

Recurrent Neural Network Based Language Modeling in Meeting Recognition

Uses an RNN together with an n-gram model to rescore hypotheses and improve the performance of a speech recognition system.
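
One standard way to combine the two models during rescoring (not necessarily the exact scheme used in the paper) is linear interpolation of their per-word probabilities; a minimal sketch with made-up numbers:

```python
import math

def interpolated_logprob(p_rnn, p_ngram, lam=0.5):
    """Linearly interpolate RNN LM and n-gram LM word probabilities.

    `lam` is the weight given to the RNN LM; the combined log-probabilities
    are summed over a hypothesis when rescoring an n-best list.
    """
    return math.log(lam * p_rnn + (1.0 - lam) * p_ngram)

# Toy usage: combine the two models' probabilities for a single word.
print(interpolated_logprob(0.02, 0.01, lam=0.7))
```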

Part Two: DNN

1 A Practical Guide to Training Restricted Boltzmann Machines

Introduces RBMs and the many tricks used when training them; a must-read if you want to implement the RBM algorithm.

2 A Fast Learning Algorithm for Deep Belief Nets

Hinton's classic, the founding paper of deep learning and the start of the deep learning boom.

3 A Learning Algorithm for Boltzmann Machines

An older paper from 1985 introducing the training algorithm for Boltzmann machines.

4 Greedy Layer-Wise Training of Deep Networks

Can be seen as Yoshua Bengio's continuation and summary of Hinton's 2006 work; it complements the 2006 paper well and is essential reading for getting started with deep learning.

The article also covers some tricks, such as how to handle the case where the first-layer units take real values.

5 Large Scale Distributed Deep Networks

Work from Jeffrey Dean's group at Google on the DistBelief framework; it describes how Google uses distribution and model partitioning to train deep networks faster.

6 Context-Dependent Pre-trained Deep Neural Networks for Large Vocabulary Speech Recognition

Microsoft's successful application of deep learning to speech: the relative error rate of the speech recognition system dropped by more than 20%. It was the first successful industrial application of deep learning and its impact was sensational.

7 Deep Belief Networks for Phone Recognition

Early speech work from Hinton's group using DNNs, the foundation for Microsoft's later work.

8 Application of Pretrained Deep Neural Networks to Large Vocabulary Speech Recognition

DNNs applied to large vocabulary conversational speech recognition, with experiments reported on voice search and YouTube data.

9 An Empirical Study of Learning Rates in Deep Neural Networks for Speech Recognition

Google's experience with tuning learning rates in their DNN-HMM speech recognition system.

10 Acoustic Modeling Using Deep Belief Networks

Early speech work by Hinton's group, mainly about how to apply DNNs to acoustic model training.

Deep Neural Networks for Acoustic Modeling in Speech Recognition

The shared views of several industry giants, including Microsoft, Google and IBM, on DNN-based speech recognition.

Deep Belief Networks Using Discriminative Features for Phone Recognition

Work by Hinton's group and IBM on training DNNs with discriminative features, using LDA to reduce the feature dimensionality to 40.

A Comparison of Deep Neural Network Training Methods for Large Vocabulary Speech Recognition

Experimental comparisons of DNN training, e.g. different pre-training methods (discriminative pre-training versus DBN generative pre-training) and changes to the neuron nonlinearity.

Asynchronous Stochastic Gradient Descent for DNN Training

Work from the Chinese Academy of Sciences on asynchronous parallel training on GPUs; the idea is basically similar to DistBelief, but the hardware is replaced by GPUs and the model is not partitioned.

Improving Deep Neural Networks for LVCSR Using Rectified Linear Units and Dropout

Improves a DNN-HMM system with ReLU and dropout.

Improving the Speed of Neural Networks on CPUs

Google's work on accelerating the forward pass of neural networks, e.g. with fixed-point computation and SIMD instructions.
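
As a generic illustration of the fixed-point idea (this is plain linear 8-bit quantization, not necessarily the paper's exact scheme):

```python
import numpy as np

def quantize_int8(w):
    """Linearly quantize a float weight matrix to signed 8-bit integers.

    Returns the int8 matrix and the scale needed to map it back to floats,
    so matrix-vector products can run in cheap integer arithmetic.
    """
    scale = np.max(np.abs(w)) / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
print(np.max(np.abs(w - q.astype(np.float32) * scale)))  # small reconstruction error
```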

Improved Bottleneck Features Using Pretrained Deep Neural Networks

Work related to Microsoft's DNN-HMM system.

Improved Feature Processing for Deep Neural Networks

Improves a DNN-HMM system with feature processing: 13-dimensional MFCC features are spliced over 9 frames and passed through an LDA-MLLT transform, and an SAT module can also be added; the resulting processed 40-dimensional features are used as input to the DNN-HMM system.

Improving Neural Networks by Preventing Co-adaptation of Feature Detectors

Describes the dropout technique and its experimental results, interpreting dropout as a form of model averaging.
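
A minimal sketch of the technique, using the "inverted dropout" formulation that is common today (the paper itself instead scales the weights down at test time; the effect is equivalent):

```python
import numpy as np

def dropout(activations, p_drop=0.5, train=True, rng=np.random):
    """Inverted dropout: randomly zero units and rescale the survivors.

    At test time nothing is dropped, which approximates averaging over the
    exponentially many "thinned" networks sampled during training.
    """
    if not train:
        return activations
    mask = (rng.rand(*activations.shape) >= p_drop) / (1.0 - p_drop)
    return activations * mask

h = np.ones((2, 6))
print(dropout(h, p_drop=0.5))   # roughly half the units zeroed, survivors scaled to 2.0
```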

Exploiting Sparseness in Deep Neural Networks for Large Vocabulary Speech Recognition

Uses soft regularization and convex constraints to make DNN models sparser. The goal of sparsification is to reduce model complexity, increase computation speed, and improve the model's generalization ability.

Feature Learning in Deep Neural Networks: Studies on Speech Recognition Tasks

Discusses DNNs from the perspective of feature learning: why deeper DNNs are better, why DNNs can learn more robust features, and so on.

Improving Neural Networks with Dropout

Master's thesis of Hinton's student Nitish Srivastava, mainly discussing the role of dropout in neural networks.

Learning Features from Music Audio with Deep Belief Networks

Application of deep networks to music classification; the features are MFCCs and the classes are genres such as hip-hop and blues.

Low-Rank Matrix Factorization for Deep Neural Network Training with High-Dimensional Output Targets

IBM's work on using low-rank matrix factorization to deal with the very large number of weight parameters in the DNN output (classification) layer.

Multilingual Training of Deep Neural Networks

Multilingual application of DNNs; when adapting to a new language, only the classification layer parameters need to be tuned.

A Cluster-Based Multiple Deep Neural Networks Method for Large Vocabulary Continuous Speech Recognition

Uses class information to split the training data and combines the small models trained on it within a Bayesian framework. This speeds up the whole training process, but some accuracy can be lost and decoding also slows down.

Restructuring of Deep Neural Network Acoustic Models with Singular Value Decomposition

Proposes using SVD to compress the weight matrices and reduce model complexity.
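
The core idea lends itself to a short sketch: factor a large weight matrix with a truncated SVD and replace the layer by two thin layers (the sizes below are arbitrary examples):

```python
import numpy as np

def svd_compress(W, rank):
    """Approximate an m x n weight matrix by two thin factors via truncated SVD.

    W is replaced by A @ B with A (m x rank) and B (rank x n), so one big
    linear layer becomes two smaller ones with far fewer parameters when
    rank << min(m, n).
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # absorb the singular values into the left factor
    B = Vt[:rank, :]
    return A, B

W = np.random.randn(512, 512)
A, B = svd_compress(W, rank=64)
print(A.shape, B.shape)                               # (512, 64) (64, 512)
print(np.linalg.norm(W - A @ B) / np.linalg.norm(W))  # relative approximation error
```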

Sparse Feature Learning for Deep Belief Networks

Marc'Aurelio Ranzato proposes an unsupervised feature-learning method whose advantage is low-dimensional, sparse features; the paper compares it with RBM and PCA.

Training Products of Experts by Minimizing Contrastive Divergence

Hinton's paper proposing the PoE (products of experts) model and discussing how to train it. The RBM is a special case of a PoE, and RBM training evolved from this work; a must-read for understanding the principle of the CD algorithm.

Understanding How Deep Belief Networks Perform Acoustic Modelling

Discusses why DBN models achieve better performance in acoustic model training, though without theoretical support.

Pipelined Back-Propagation for Context-Dependent Deep Neural Networks

Uses multiple GPUs to pipeline the network in parallel; the paper also mentions other parallelization measures such as data parallelism and model parallelism.

Recent Advances in Deep Learning for Speech Research at Microsoft

Surveys Microsoft's progress in deep learning for speech, such as using more primitive (raw) features, multi-task feature learning, and DNN model adaptation.

Rectified Linear Units Improve Restricted Boltzmann Machines

Applies ReLU to RBM models, i.e. replaces the nonlinearity.

Reducing the Dimensionality of Data with Neural Networks

Hinton's Science paper on using neural networks for nonlinear dimensionality reduction, with comparisons to linear PCA.

Data Normalization in the Learning of Restricted Boltzmann Machines

A data-processing trick for RBM training: zero-mean normalization makes RBM training more robust.

Connectionist Probability Estimators in HMM Speech Recognition

Early work on using neural networks in acoustic model training, the foundation of today's DNN-HMM work.

Deep Learning for Robust Feature Generation in Audiovisual Emotion Recognition

Applies deep learning to emotion recognition in audiovisual systems; presents a hybrid model trained on multiple visual and auditory signals.

Improving Training Time of Deep Belief Networks Through Hybrid Pre-training and Larger Batch Sizes

Combines generative pre-training with discriminative pre-training; larger minibatch sizes increase the granularity of data parallelism.

Training Restricted Boltzmann Machines Using Approximations to the Likelihood Gradient

Proposes PCD, a new algorithm for training RBMs. Unlike CD, it keeps a single persistent Markov chain that is not restarted after each parameter update, on the assumption that the model does not change much between updates; the paper also recommends a small learning rate.
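
The difference from CD is small enough to show in a sketch of one update for a binary RBM (biases and many practical details are omitted; variable names are illustrative only):

```python
import numpy as np

rng = np.random.RandomState(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample(p):
    return (rng.rand(*p.shape) < p).astype(float)

def rbm_update(W, v_data, persistent_chain=None, lr=0.01):
    """One gradient step for a binary RBM without bias terms.

    With `persistent_chain=None` this is plain CD-1: the negative chain is
    restarted from the data. Reusing the returned chain across updates gives
    PCD-style behaviour, where the chain is not restarted because the model
    is assumed to change only slightly per update (hence the small lr).
    """
    h_data = sigmoid(v_data @ W)                      # positive phase
    v_start = v_data if persistent_chain is None else persistent_chain
    h_sample = sample(sigmoid(v_start @ W))           # negative phase
    v_model = sample(sigmoid(h_sample @ W.T))
    h_model = sigmoid(v_model @ W)
    W += lr * (v_data.T @ h_data - v_model.T @ h_model) / v_data.shape[0]
    return W, v_model                                 # v_model is the next persistent state

# Toy usage: 6 visible and 4 hidden units, PCD-style updates on random binary data.
W = 0.01 * rng.randn(6, 4)
v = (rng.rand(20, 6) < 0.5).astype(float)
chain = v.copy()
for _ in range(10):
    W, chain = rbm_update(W, v, persistent_chain=chain)
```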

Classification Using Discriminative Restricted Boltzmann Machines

Proposes the discriminative DRBM: whereas the generative RBM optimizes the joint P(x, y), the discriminative DRBM optimizes the conditional P(y|x), where y is the label; a hybrid version is also proposed.
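
Roughly in symbols (notation here is only indicative, not copied from the paper; alpha is the mixing weight of the hybrid version):

```latex
\mathcal{L}_{\mathrm{gen}}    = -\sum_i \log p(x_i, y_i), \qquad
\mathcal{L}_{\mathrm{disc}}   = -\sum_i \log p(y_i \mid x_i), \qquad
\mathcal{L}_{\mathrm{hybrid}} = \mathcal{L}_{\mathrm{disc}} + \alpha \, \mathcal{L}_{\mathrm{gen}}
```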

Learning Multiple Layers of Features from Tiny Images

Master's thesis of Hinton's student Alex Krizhevsky, which ties together some of his DNN work.

Making Deep Belief Networks Effective for Large Vocabulary Continuous Speech Recognition

Discusses how to train DNNs effectively, with a focus on parallel training.

Optimization Techniques to Improve Training Speed of Deep Neural Networks for Large Speech Tasks

Tips from Tara N. Sainath's team at IBM on improving DNN parallelism and reducing the number of model parameters; IBM mainly uses low-rank matrix factorization for the classification layer.

Although the CNN is an evolution of the DNN with relatively few parameters, at present the best CNN results in speech recognition are only about as good as a DNN with the same number of parameters.

Parallel Training of Neural Networks for Speech Recognition

Work on parallel training of neural networks, divided into two parts: multi-threaded multi-core parallelization and SIMD-based GPU parallelization.

Accurate and Compact Large Vocabulary Speech Recognition on Mobile Devices

Google's practical work on mobile speech recognition, in particular the DNN and LM optimizations. The DNN optimizations include fixed-point computation, SIMD acceleration, batched lazy computation, and frame skipping; the language model is also compressed. A practical reference of great value.

Cross-Language Knowledge Transfer Using Multilingual Deep Neural Network with Shared Hidden Layers

Multilingual DNN training in which all languages share the same hidden layers while each language has its own classification layer. This training reduces the error rate by around 3-5%; the reason is somewhat similar to transfer learning, as knowledge can be transferred between languages.

Improving Wideband Speech Recognition Using Mixed-Bandwidth Training Data in CD-DNN-HMM

Mixed-bandwidth CD-DNN-HMM training with 8 kHz and 16 kHz data. The key issue is how to align the filter-bank features across the different bandwidths; the paper also covers some filter-bank training details, such as whether to use dynamic or static features.

Robust Visual Recognition Using Multilayer Generative Neural Networks

Master's thesis of Hinton's student Yichuan Tang, a series of DNN work on visual recognition.

Deep Boltzmann Machines

The paper that introduced the DBM model.

Rectified Linear Units for Speech Processing

Performance analysis of ReLU in speech recognition.

Part Three: CNN

1 Deep Convolutional Network Cascade for Facial Point Detection

CNN applied to facial keypoint detection.

2 Applying Convolutional Neural Networks Concepts to Hybrid NN-HMM Model for Speech Recognition

CNN applied to a speech recognition system.

3 ImageNet Classification with Deep Convolutional Neural Networks

The CNN from Hinton's group that won the 2012 ImageNet competition. The paper is short on details, but it describes the tricks used in the network, especially ReLU.

4 Gradient-Based Learning Applied to Document Recognition

Yann LeCun's classic CNN article; read this first to understand CNNs.

5 A Theoretical Analysis of Feature Pooling in Visual Recognition

Analyzes the principle of pooling in visual recognition and summarizes similar methods used in vision, such as HOG and SIFT.

6 What Is the Best Multi-Stage Architecture for Object Recognition?

Discusses how to design multi-stage architectures for better performance on object recognition, and examines architectural questions such as how to obtain invariant features and how to pool information across levels; anyone doing vision should take a good look at this article.

7 Deep Convolutional Neural Networks for LVCSR

CNN actually applied to LVCSR.

8 Learning Mid-Level Features for Recognition

Worth reading for its analysis of current visual recognition pipelines and the relationships between their parts, such as coding and pooling.

9 Convolutional Networks and Applications in Vision

Analyzes convolutional networks in vision applications; recommended for vision work. The idea of layering gives a good internal representation for vision. The paper breaks the convolutional network into filter bank, nonlinearity, and pooling layers and analyzes each.

10 Convolutional Neural Networks Applied to House Numbers Digit Classification

Convolutional network applied to house number digit classification. The paper uses Lp pooling, in which a Gaussian kernel gives stronger features more weight and suppresses weaker ones.

Visualizing and Understanding Convolutional Networks

Very meaningful work on visualizing convolutional network features: a deconvnet is used to visualize the features of each convolutional layer, and these visualizations can help tune the model.

Stochastic Pooling for Regularization of Deep Convolutional Neural Networks

Proposes stochastic pooling: unlike max pooling and average pooling, the pooled activation is chosen at random. The paper argues that stochastic pooling acts similarly to dropout: it is as if many noisy copies of the input image were generated and passed through a max-pooling layer, which effectively prevents overfitting.
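
A minimal sketch of the training-time behaviour for a single pooling region (names are illustrative; at test time the paper instead uses a probability-weighted average):

```python
import numpy as np

rng = np.random.RandomState(0)

def stochastic_pool(region):
    """Stochastic pooling over one region of non-negative activations.

    Each activation is picked with probability proportional to its value,
    so strong responses usually win but weaker ones occasionally get through,
    which is where the regularization effect comes from.
    """
    a = region.ravel()
    total = a.sum()
    if total == 0.0:
        return 0.0
    return rng.choice(a, p=a / total)

region = np.array([[0.0, 1.0],
                   [3.0, 0.0]])
print(stochastic_pool(region))   # 1.0 with probability 0.25, 3.0 with probability 0.75
```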

Adaptive Deconvolutional Networks for Mid and High Level Feature Learning

An unsupervised method for learning mid- and high-level features: image features are learned by reconstructing the image through deconvolution.

Best Practices for Convolutional Neural Networks Applied to Visual Document Analysis

Practical convolutional network work; the article's methods for dealing with limited training data are worth referencing.

Multi-Column Deep Neural Networks for Image Classification

Combines several deep network models and averages their outputs.
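
The combination step itself is just averaging of class probabilities, as in this toy sketch (the "columns" here are stand-in callables, not real trained networks):

```python
import numpy as np

def multi_column_predict(columns, x):
    """Average the class-probability outputs of several independently trained nets."""
    probs = np.mean([net(x) for net in columns], axis=0)
    return probs.argmax(), probs

# Toy usage with three fake columns that each return fixed probabilities.
cols = [lambda x: np.array([0.6, 0.4]),
        lambda x: np.array([0.5, 0.5]),
        lambda x: np.array([0.2, 0.8])]
print(multi_column_predict(cols, x=None))   # class 1 wins with mean probs (0.433, 0.567)
```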

Differentiable Pooling for Hierarchical Feature Learning

Presents a differentiable pooling method based on Gaussians; read the adaptive deconvolutional networks paper above first. Compared with max pooling and average pooling, it has some advantages when used for reconstruction in the deconvolutional setting.

Notes on Convolutional Neural Networks

A more detailed introduction to convolutional neural networks, including how the gradients are computed.

Fast Inference in Sparse Coding Algorithms with Applications to Object Recognition

The unsupervised learning algorithm PSD: on top of the sparse coding framework, it adds a term that keeps the output of a nonlinear predictor close to the sparse code. When optimizing the objective, some parameters are held fixed in turn, an idea somewhat similar to coordinate descent.

Deep Neural Networks for Object Detection

Google uses DNN-based (actually CNN-based) regression for object detection: it first regresses object masks and then localizes the objects precisely.

Multi-GPU Training of ConvNets

Engineering techniques for training convolutional networks in parallel on multiple GPUs.

Flexible, High Performance Convolutional Neural Networks for Image Classification

An early article in which CNNs were actually trained on GPUs.

Multi-Digit Number Recognition from Street View Imagery Using Deep Convolutional Neural Networks

Uses a CNN to turn digit recognition in Google Street View images into a sequence recognition problem. Traditional OCR digit recognition usually segments the digits first, whereas here the whole sequence is recognized at once. The paper reports the model's recognition rates on several datasets; the training framework is also based on Google's DistBelief.

Part Four: Other

1 An Introduction to Deep Learning

A brief overview of deep learning; relatively short, simply mentioning some commonly used deep learning models.

2 The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-training

Discusses the difficulties of training deep architectures, analyzes the advantages of pre-training from experimental data, and argues that pre-training behaves like a regularizer on the weight matrices.

3 Why Does Unsupervised Pre-training Help Deep Learning

Discusses several ways in which unsupervised learning helps deep learning and advances the view of pre-training as a regularizer. The analysis is based on experimental data rather than theory; the lack of a complete theoretical framework is what deep learning is most criticized for at this stage.

4 Learning Deep Architectures for AI

Yoshua Bengio's review of deep learning; a good first read (or skim) for getting to know the field.

5 Representation Learning: A Review and New Perspectives

Yoshua Bengio's survey of representation learning.

6 On Optimization Methods for Deep Learning

Discusses several optimization methods for deep learning (SGD, L-BFGS, CG) and compares their advantages and disadvantages experimentally.

7 Using Very Deep Autoencoders for Content-Based Image Retrieval

Uses the middle layer of a deep autoencoder as a global image feature for image retrieval.

8 Deep Learning for Signal and Information Processing

Li Deng's lecture material from the 2013 Dragon Star machine learning course, mainly focused on deep learning for speech; fairly detailed.

9 On the Importance of Initialization and Momentum in Deep Learning

Shows the importance of initialization and momentum in deep learning, mostly through experimental analysis.
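
A minimal sketch of the classical momentum update the paper builds on (the hyperparameters below are arbitrary; Nesterov momentum differs only in where the gradient is evaluated):

```python
import numpy as np

def momentum_step(w, v, grad, lr=0.01, mu=0.9):
    """One classical-momentum SGD update: v accumulates decayed past gradients."""
    v = mu * v - lr * grad
    w = w + v
    return w, v

# Toy usage: minimise f(w) = 0.5 * ||w||^2, whose gradient is simply w.
w = np.array([5.0, -3.0])
v = np.zeros_like(w)
for _ in range(200):
    w, v = momentum_step(w, v, grad=w, lr=0.05, mu=0.9)
print(w)   # close to the optimum at the origin
```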

10 Dropout Training as Adaptive Regularization

Analyzes dropout from first principles and shows it is equivalent to an adaptive regularization technique.

Deep Learning via Hessian-Free Optimization

At present most deep learning optimization relies on stochastic gradient methods; this paper proposes a second-order, Hessian-free optimization algorithm.

Deep Stacking Networks for Information Retrieval

Work on applying DSN networks to information retrieval.

Deep Convex Net: A Scalable Architecture for Speech Pattern Classification

A model designed by Microsoft to overcome the difficulty of parallelizing DNN training; it has great advantages in computational scalability.

Parallel Training of Deep Stacking Networks

Parallelization of DSN training.

Scalable Stacking and Learning for Building Deep Architectures

A DSN-related article; the several DSN papers here are best read together.
