The best way to learn TensorFlow is to read the official document: https://www.tensorflow.org/versions/r0.12/tutorials/seq2seq/
First, using RNNs in TensorFlow:
1. Using an LSTM cell
lstm = rnn_cell.BasicLSTMCell(lstm_size)
# Initial state of the LSTM memory.
state = tf.zeros([batch_size, lstm.state_size])
probabilities = []
loss = 0.0
for current_batch_of_words in words_in_dataset:
    # The value of state is updated after processing each batch of words.
    output, state = lstm(current_batch_of_words, state)
    # The LSTM output can be used to make next-word predictions.
    logits = tf.matmul(output, softmax_w) + softmax_b
    probabilities.append(tf.nn.softmax(logits))
    loss += loss_function(probabilities, target_words)
2. Truncated backpropagation: unrolling the network for a fixed number of steps
# Placeholder for the inputs in a given iteration.
# num_steps can be seen as the number of words processed per sentence.
words = tf.placeholder(tf.int32, [batch_size, num_steps])

lstm = rnn_cell.BasicLSTMCell(lstm_size)
# Initial state of the LSTM memory.
initial_state = state = tf.zeros([batch_size, lstm.state_size])

for i in range(num_steps):
    # The value of state is updated after processing each batch of words.
    output, state = lstm(words[:, i], state)
    # The rest of the code.
    # ...

final_state = state
3. Input data
# Before feeding words to the LSTM, each word must be encoded as an embedding (word2vec style).
# embedding_matrix is a tensor of shape [vocabulary_size, embedding_size]; word_ids are the index numbers of the words.
word_embeddings = tf.nn.embedding_lookup(embedding_matrix, word_ids)
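For context, here is a minimal sketch of how such an embedding matrix might be created and looked up; the sizes and variable names are illustrative assumptions, not values from the tutorial.

import tensorflow as tf

# Illustrative sizes, chosen only for this sketch.
vocabulary_size = 10000
embedding_size = 128
batch_size, num_steps = 20, 35

# Trainable embedding matrix of shape [vocabulary_size, embedding_size].
embedding_matrix = tf.get_variable(
    "embedding", [vocabulary_size, embedding_size], dtype=tf.float32)

# Integer word indices for the current batch.
word_ids = tf.placeholder(tf.int32, [batch_size, num_steps])

# Result has shape [batch_size, num_steps, embedding_size].
word_embeddings = tf.nn.embedding_lookup(embedding_matrix, word_ids)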
4. Building a multi-layer LSTM; number_of_layers is the depth of the stacked LSTM
lstm = rnn_cell.BasicLSTMCell(lstm_size, state_is_tuple=False)
stacked_lstm = rnn_cell.MultiRNNCell([lstm] * number_of_layers,
                                     state_is_tuple=False)

initial_state = state = stacked_lstm.zero_state(batch_size, tf.float32)
for i in range(num_steps):
    # The value of state is updated after processing each batch of words.
    output, state = stacked_lstm(words[:, i], state)
    # The rest of the code.
    # ...

final_state = state
Second, seq2seq models: these can be used for translation, dialogue, language generation, and other scenarios.
1. Files involved:
seq2seq.py: library functions for building seq2seq models
seq2seq_model.py: the seq2seq neural network model
data_utils.py: prepares the training data
translate.py: launches training of the seq2seq model
2. Structure of the seq2seq model:
The basic structure has two parts: an encoder that consumes the input sequence and a decoder that produces the output sequence.
3. Using the seq2seq library:
outputs, states = basic_rnn_seq2seq(encoder_inputs, decoder_inputs, cell)
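As a rough illustration, basic_rnn_seq2seq takes Python lists of per-time-step tensors. Below is a minimal sketch against the TF r0.12 tf.nn.seq2seq API; the sizes and placeholder setup are assumptions made for the example.

import tensorflow as tf

# Illustrative sizes, not taken from the tutorial.
batch_size, input_size = 32, 128
encoder_length, decoder_length = 5, 10

# basic_rnn_seq2seq expects Python lists of 2-D tensors, one per time step.
encoder_inputs = [tf.placeholder(tf.float32, [batch_size, input_size])
                  for _ in range(encoder_length)]
decoder_inputs = [tf.placeholder(tf.float32, [batch_size, input_size])
                  for _ in range(decoder_length)]

cell = tf.nn.rnn_cell.BasicLSTMCell(input_size)

# outputs is a list of decoder outputs (one per decoder step);
# states is the final RNN state after decoding.
outputs, states = tf.nn.seq2seq.basic_rnn_seq2seq(
    encoder_inputs, decoder_inputs, cell)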
4. Model Explanation:
(1) Bucketing and padding
To deal efficiently with sentences of inconsistent length, sentence lengths are grouped into a few buckets, and each sentence is padded up to the size of its bucket (a small sketch follows below), for example:
buckets = [(5, 10), (10, 15), (20, 25), (40, 50)]
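Here is a toy sketch of how a sentence pair could be assigned to a bucket and padded; PAD_ID and the helper function are illustrative assumptions, not the exact code in data_utils.py.

PAD_ID = 0
buckets = [(5, 10), (10, 15), (20, 25), (40, 50)]

def put_in_bucket(source_ids, target_ids):
    # Pick the smallest bucket that fits both the source and the target sentence.
    for source_size, target_size in buckets:
        if len(source_ids) < source_size and len(target_ids) < target_size:
            # Pad each side up to the bucket size.
            source_pad = source_ids + [PAD_ID] * (source_size - len(source_ids))
            target_pad = target_ids + [PAD_ID] * (target_size - len(target_ids))
            return (source_size, target_size), source_pad, target_pad
    return None  # Sentence too long for every bucket; usually skipped.

# Example: a 3-word source and a 6-word target fall into the (5, 10) bucket.
bucket, src, tgt = put_in_bucket([4, 7, 9], [12, 5, 8, 3, 2, 6])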
5. Steps to train the model, taking screenplay dialogue data as an example:
(1) Format the data, then split it into a training set and a test set
(2) Build a vocabulary, then convert the sentences into word IDs (see the sketch after this list)
(3) Define the hyperparameters and start training the model
(4) Use the trained model
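For step (2), a toy sketch of building a vocabulary and converting sentences to word IDs is shown below; the special tokens and helper names are illustrative assumptions, not the exact implementation in data_utils.py.

_PAD, _GO, _EOS, _UNK = "_PAD", "_GO", "_EOS", "_UNK"

def build_vocabulary(sentences, max_vocab_size=10000):
    counts = {}
    for sentence in sentences:
        for word in sentence.split():
            counts[word] = counts.get(word, 0) + 1
    # Most frequent words first, after the special tokens.
    words = [_PAD, _GO, _EOS, _UNK] + sorted(counts, key=counts.get, reverse=True)
    words = words[:max_vocab_size]
    return {word: idx for idx, word in enumerate(words)}

def sentence_to_ids(sentence, vocab):
    return [vocab.get(word, vocab[_UNK]) for word in sentence.split()]

vocab = build_vocabulary(["how are you", "fine thank you"])
print(sentence_to_ids("how are you", vocab))   # e.g. [5, 6, 4]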