1. In [1], a deep neural network (DNN) is used for word alignment. The training data required for alignment comes from the HMM and IBM Model 4 aligners. The input layer is built from four parts; see [1] for the exact architecture.
The reported results are better than those of the original HMM and IBM Model 4 baselines; see [1] for details. A rough sketch of this kind of alignment scorer is given below.
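The following is a minimal NumPy sketch of a context-dependent alignment scorer in the spirit of [1]: embeddings of a source word and a target word, each with a small context window, are concatenated and fed through a tanh hidden layer that outputs a single alignment score. The vocabulary size, window size, dimensions, and layer count are illustrative assumptions, not the paper's settings.

import numpy as np

# Toy sizes, chosen for illustration only (not from the paper).
rng = np.random.default_rng(0)
VOCAB, DIM, WIN, HIDDEN = 1000, 20, 3, 40

E = rng.normal(scale=0.1, size=(VOCAB, DIM))           # shared lookup (embedding) table
W1 = rng.normal(scale=0.1, size=(HIDDEN, 2 * WIN * DIM))
b1 = np.zeros(HIDDEN)
w2 = rng.normal(scale=0.1, size=HIDDEN)                # final scoring layer

def score(src_window, tgt_window):
    """src_window, tgt_window: lists of WIN word ids (word plus its context)."""
    # Look up embeddings for both windows and concatenate into one input vector.
    x = np.concatenate([E[src_window].ravel(), E[tgt_window].ravel()])
    h = np.tanh(W1 @ x + b1)        # one tanh hidden layer
    return float(w2 @ h)            # higher score = more plausible alignment link

# Score one candidate link: source word 5 vs. target word 9, each in context.
print(score([4, 5, 6], [8, 9, 10]))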
This idea can be applied in many places, such as computing segment similarity, sentence similarity, translation probabilities, and word vectors. In practice, however, DNNs cannot beat other methods across the board. In principle, the approach is similar to building a machine learning model with contextual features. Some may argue that word vectors implicitly encode richer features, but from my own use and observation, word vectors are still limited in what they can do. The problem may be that language, with its rich meanings and flexible combinations, is a living thing, while a mathematical representation is fixed. A word vector treats a word as a standardized industrial part: it has fixed specification parameters and sits in a clearly defined position. Unfortunately, language is not that simple. A word is more like liquid metal: it has its current shape and size, but it can also fuse with other pieces of metal to form a new shape and take on a new use. For example, "big" has its literal sense of size, but in certain colloquial combinations it takes on a completely different meaning; a fixed number of dimensions cannot represent a living word. To put it bluntly, words are alive and vectors are dead. This is why I think word vectors are useful but not sufficient in practice. I have some immature ideas on this that still need to be verified.

2. Knowledge points:
To be added later...
3. Approximate code implementation (in Python, using IBM Model 1 + a neural network):
To be added in a follow-up...
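In the meantime, here is a minimal sketch of the IBM Model 1 part: plain EM training of the lexical translation table t(f|e) on a toy parallel corpus. This is textbook IBM Model 1, not the author's promised implementation, and the neural component is left out; the function and variable names are illustrative.

from collections import defaultdict

def train_ibm1(corpus, iterations=10):
    """corpus: list of (source_tokens, target_tokens) sentence pairs."""
    # Initialize t(f|e) uniformly over all co-occurring word pairs.
    t = defaultdict(float)
    target_vocab = {e for _, es in corpus for e in es}
    uniform = 1.0 / len(target_vocab)
    for fs, es in corpus:
        for f in fs:
            for e in es:
                t[(f, e)] = uniform
    for _ in range(iterations):
        count = defaultdict(float)  # expected counts c(f, e)
        total = defaultdict(float)  # normalizer per target word e
        # E-step: collect expected alignment counts.
        for fs, es in corpus:
            for f in fs:
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    p = t[(f, e)] / z
                    count[(f, e)] += p
                    total[e] += p
        # M-step: re-estimate t(f|e) from the expected counts.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t

# Toy usage: after a few iterations t[("das", "the")] should dominate.
corpus = [("das haus".split(), "the house".split()),
          ("das buch".split(), "the book".split()),
          ("ein buch".split(), "a book".split())]
t = train_ibm1(corpus)
print(t[("das", "the")], t[("buch", "book")])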
References:
[1] Word Alignment Modeling with Context Dependent Deep Neural Network. ACL 2013.
When reprinting, please cite the source: http://www.cnblogs.com/breakthings/p/4049854.html