These are reading notes for the paper "Chinese Poetry Generation with Recurrent Neural Networks", published at EMNLP 2014.
ABSTRACT
This paper presents a model for classical Chinese poetry generation based on recurrent neural networks.
PROPOSED METHOD
Generation of the first sentence
The first sentence is generated differently from the rest, using the conventional pipeline described below.
The user first provides several keywords, which are expanded into a larger set of related phrases using the Shixuehanying (诗学含英), a poetic phrase taxonomy compiled in the Qing dynasty. All candidate first sentences that satisfy the formal constraints (mostly tonal) are then generated, and a language model ranks them to pick the best one.
There is an important sentence in the original paper here that I do not fully understand:
"In implementation, we employ a character-based recurrent neural network language model (Mikolov et al., 2010) interpolated with a Kneser-Ney trigram and find the n-best candidates with a stack decoder."
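To make this concrete, here is a minimal Python sketch of the ranking idea: score each candidate first sentence with an interpolation of an RNN language model and a Kneser-Ney trigram model, then keep the n best. The `prob(char, history)` interface, the interpolation weight, and the exhaustive scoring (in place of a real stack decoder) are my own simplifying assumptions, not the paper's implementation.

```python
import math

# Hedged sketch: rank candidate first sentences by interpolating a
# character-based RNN LM with a Kneser-Ney trigram LM. `rnn_lm` and
# `trigram_lm` are assumed to expose prob(char, history); `lam` is an
# assumed interpolation weight. The stack decoder is simplified to
# scoring an already-enumerated candidate list.

def interpolated_log_prob(line, rnn_lm, trigram_lm, lam=0.5):
    """Score one candidate line character by character."""
    total = 0.0
    for j, ch in enumerate(line):
        history = line[:j]
        p = lam * rnn_lm.prob(ch, history) + (1 - lam) * trigram_lm.prob(ch, history)
        total += math.log(max(p, 1e-12))  # guard against zero probabilities
    return total

def n_best(candidates, rnn_lm, trigram_lm, n=10):
    """Keep the n highest-scoring candidate first sentences."""
    return sorted(candidates,
                  key=lambda line: interpolated_log_prob(line, rnn_lm, trigram_lm),
                  reverse=True)[:n]
```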
Generation of the following sentences
Subsequent sentences are generated word by word (each "word" here is a single Chinese character).
Given the previously generated sentences \(S_1, S_2, \dots, S_i\), the probability of the next sentence \(S_{i+1}\) is:
\[ P(S_{i+1} \mid S_{1:i}) = \prod_{j=1}^{m-1} P(w_{j+1} \mid w_{1:j}, S_{1:i}) \]
That is, the probability of the sentence is the product of the probabilities of the words that make it up, where the probability of each word depends on the j words already generated in the current sentence and on the i preceding sentences.
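To make the factorisation explicit, here is a tiny sketch: given any function returning \(P(w_{j+1} \mid w_{1:j}, S_{1:i})\) (in the paper this is supplied by the RGM together with the CSM and RCM below), the log-probability of a candidate sentence is just the sum of the per-word terms. The `cond_prob` callable is an assumption for illustration.

```python
import math

# Minimal sketch of the factorisation: the log-probability of a candidate
# next sentence is the sum of per-word conditional log-probabilities.
# `cond_prob` stands for any model of P(w_{j+1} | w_{1:j}, S_{1:i}).

def line_log_prob(line, previous_lines, cond_prob):
    """line: words w_1..w_m of the candidate; previous_lines: S_1..S_i."""
    total = 0.0
    for j in range(1, len(line)):              # j = 1 .. m-1
        total += math.log(cond_prob(line[j], line[:j], previous_lines))
    return total
```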
The entire model consists of three sub-models:
1. CSM, the convolutional sentence model
The task of this model is to map a sentence \(S_i\) to a vector \(v_i\):
\[ v_i = CSM(S_i) \]
The CSM is based on the convolutional sentence model of Kalchbrenner and Blunsom.
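Below is a much-simplified sketch of what a CSM-like model does: character embeddings are repeatedly merged by a local filter with shared weights until a single sentence vector remains. The actual CSM uses several filter widths and its own parameterisation; the sizes and weight matrices here are illustrative assumptions.

```python
import numpy as np

# Simplified CSM-style sentence model: merge adjacent vectors with a shared
# width-2 filter until one sentence vector is left. Sizes are assumptions.

q = 8  # embedding / hidden size (assumed)

def csm(chars, E, W):
    """chars: list of character ids; E: |V| x q embedding matrix;
    W: q x (2*q) merge matrix shared across positions."""
    layer = [E[c] for c in chars]               # level 1: character embeddings
    while len(layer) > 1:
        merged = []
        for t in range(len(layer) - 1):         # merge each adjacent pair
            pair = np.concatenate([layer[t], layer[t + 1]])
            merged.append(np.tanh(W @ pair))    # shared weights + nonlinearity
        layer = merged
    return layer[0]                             # sentence vector v_i

# usage (assumed toy sizes):
# v = csm([3, 17, 42, 8, 25],
#         E=np.random.randn(100, q) * 0.1,
#         W=np.random.randn(q, 2 * q) * 0.1)
```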
2. RCM, the recurrent context model
This model maps the i vectors produced by the CSM for the preceding i sentences to context vectors \(u_{i}^{j}\), one for each position j of the next sentence:
\[ u_i^{j} = RCM(v_{1:i}, j) \]
This is an encoder-decoder model: the first i sentences are encoded into a single vector, which is then decoded into m vectors, one per position of the next sentence. For a five-character quatrain, five vectors are decoded, one for each character. These position-specific vectors are then fed into the generation model described next.
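A rough sketch of this idea: a plain recurrent encoder compresses \(v_1, \dots, v_i\) into a hidden state, and position-specific output matrices decode one context vector per position of the next sentence. The matrices `M` and `U_js` and their sizes are assumptions for illustration, not the paper's exact parameterisation.

```python
import numpy as np

# Simplified RCM-style context model: encode the sentence vectors with a
# plain RNN, then decode one context vector per position of the next line.

def rcm(vs, M, U_js):
    """vs: sentence vectors v_1..v_i (each length q);
    M: q x (2*q) recurrence matrix; U_js: one q x q matrix per position j."""
    q = vs[0].shape[0]
    h = np.zeros(q)
    for v in vs:                                  # encode the i previous sentences
        h = np.tanh(M @ np.concatenate([v, h]))
    # decode one context vector per position j of the next sentence
    return [np.tanh(U_j @ h) for U_j in U_js]     # [u_i^1, u_i^2, ...]

# usage (assumed toy sizes):
# q = 8
# vs = [np.random.randn(q) for _ in range(2)]
# M = np.random.randn(q, 2 * q) * 0.1
# U_js = [np.random.randn(q, q) * 0.1 for _ in range(5)]
# us = rcm(vs, M, U_js)
```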
3. RGM, the recurrent generation model
This model predicts, for every word w in the vocabulary, the probability that the next output is w. It takes as input the output of the RCM and the first j words of the current sentence (each word is represented by its one-hot encoding vector).
\[ P(w_{j+1} \mid w_{1:j}, S_{1:i}) = RGM(w_{1:j+1}, u_i^{j}) \]
This is actually a language model.
\(e(w_j)\) is the one-hot encoding of the word \(w_j\). Note that the matrix \(Y \in \mathbb{R}^{|V| \times q}\) "decodes the hidden representation to weights for all words in the vocabulary".
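To tie the three parts together, here is a minimal sketch of one RGM step: the hidden state is updated from the one-hot vector of the previous word, the RCM context \(u_i^j\), and the previous hidden state, and the matrix Y maps the new hidden state to a distribution over the vocabulary. All weight matrices and sizes below are illustrative assumptions.

```python
import numpy as np

# Simplified RGM-style step: update the hidden state from e(w_j), the RCM
# context u_i^j and the previous hidden state, then decode a distribution
# over the vocabulary through Y. Shapes and weights are assumptions.

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def rgm_step(w_prev, u, h, X, H, Y):
    """w_prev: id of w_j; u: context u_i^j (length q); h: previous hidden state.
    X: q x |V| input matrix, H: q x (2*q) recurrence matrix, Y: |V| x q output matrix."""
    e_w = np.zeros(X.shape[1])
    e_w[w_prev] = 1.0                                 # one-hot e(w_j)
    h_new = np.tanh(X @ e_w + H @ np.concatenate([u, h]))
    p = softmax(Y @ h_new)                            # P(w_{j+1} = w | ...) for all w
    return p, h_new
```

Iterating this step for j = 1 to m-1 and multiplying the probabilities of the chosen words gives exactly the product in the factorisation above.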
CONTRIBUTION
The paper has two main innovations:
- With the RNN model, the formal constraints and the selection of content are handled jointly.
- All previously generated sentences are taken into account when generating each new sentence.
RESULT
Enjoy the poetry generated by the full model.
Paper "Chinese poetry Generation with recurrent neural Network" reading notes