Sequence classification predicts a single category label for an entire input sequence. A typical example is sentiment analysis: predicting the attitude a writer expresses toward the topic of a text. The same idea can be used to predict election results or product and movie ratings.
We use the Internet Movie Database (IMDB) movie review dataset. The target value is binary: positive or negative. Language is full of negation, irony, and ambiguity, so it is not enough to check whether individual words appear. Instead we build a recurrent network over word vectors, read each review one word at a time, and train a classifier on the activation after the last word to predict the sentiment of the whole review.
The IMDB movie review dataset (http://ai.stanford.edu/~amaas/data/sentiment/) comes from the Stanford University AI Lab. It is distributed as a compressed tar archive, with positive and negative reviews stored as text files in two separate folders. We use a regular expression to extract plain-text tokens and convert all letters to lowercase.
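As a quick illustration of the tokenization, here is a small sketch (the sample sentence is made up) applying the same regular expression used in the full listing at the end and lowercasing the result:

import re

# Matches runs of letters or single punctuation marks from the listed set.
TOKEN_REGEX = re.compile(r'[A-Za-z]+|[!?.:,()]')

text = 'Great movie, but the ending was NOT believable...'
tokens = [t.lower() for t in TOKEN_REGEX.findall(text)]
print(tokens)
# ['great', 'movie', ',', 'but', 'the', 'ending', 'was', 'not', 'believable', '.', '.', '.']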
Word vector embeddings are a richer representation of words than one-hot encoding. The vocabulary determines each word's index, which is used to look up the correct word vector. All sequences are padded to the same length so that multiple movie reviews can be fed into the network as a batch.
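A minimal NumPy sketch of the padding step (toy sizes, made-up vectors): a shorter review is copied into the front of a zero matrix, so every sequence ends up with the same shape and the remaining rows stay zero vectors.

import numpy as np

embedding_size = 4
max_length = 6
word_vectors = np.random.rand(3, embedding_size)  # vectors for a 3-word review

padded = np.zeros((max_length, embedding_size))
padded[:len(word_vectors)] = word_vectors
# Rows 0-2 hold real word vectors, rows 3-5 remain all-zero padding.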
The sequence classification model takes two placeholders: one for the input data (the sequence) and one for the target value (the sentiment). It also receives a params configuration object carrying settings such as the optimizer.
The model dynamically computes the length of each sequence in the current batch. Because the data arrives as a single tensor, every sequence is padded with zero vectors up to the length of the longest movie review. We reduce each word vector to its maximum absolute value: a padding vector reduces to the scalar 0, while a real word vector reduces to a scalar greater than 0. tf.sign() turns this into a discrete 0 or 1, and summing these values along the time-step dimension yields the sequence length. The resulting tensor has one entry per sequence in the batch, each scalar giving that sequence's length.
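Here is a minimal worked sketch of that length computation on a toy batch of two sequences padded to four time steps (word vectors of size three); real time steps reduce to 1 and padding to 0:

import tensorflow as tf

data = tf.constant([
    [[0.5, -0.2, 0.1], [0.3, 0.0, 0.9], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
    [[0.7, 0.1, -0.4], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
])
used = tf.sign(tf.reduce_max(tf.abs(data), reduction_indices=2))
length = tf.cast(tf.reduce_sum(used, reduction_indices=1), tf.int32)

with tf.Session() as sess:
    print(sess.run(length))  # [2 1]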
We use the params object to define the cell type and number of hidden units. The length property tells the RNN how many time steps of each sequence in the batch to process. We then fetch the last activation of each sequence and feed it into a softmax layer. Because every movie review has a different length, the last relevant RNN output sits at a different index for each sequence in the batch. The index has to be built in the time-step dimension (the batch data has shape sequences x time_steps x word_vectors), but tf.gather() only indexes along the first dimension. So we flatten the first two dimensions of the output activations and build an index of the form sequence_index * max_length + (length - 1), which selects the last available time step of each sequence.
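A small worked sketch of the flatten-and-gather trick (toy numbers): with a batch of two sequences, max_length 4 and lengths [2, 1], the flattened output has 8 rows, and the indices 0*4 + (2-1) = 1 and 1*4 + (1-1) = 4 pick the last real output of each sequence.

import tensorflow as tf

batch_size, max_length, output_size = 2, 4, 3
output = tf.reshape(
    tf.range(batch_size * max_length * output_size, dtype=tf.float32),
    [batch_size, max_length, output_size])
length = tf.constant([2, 1])

index = tf.range(0, batch_size) * max_length + (length - 1)  # [1, 4]
flat = tf.reshape(output, [-1, output_size])                  # 8 rows
relevant = tf.gather(flat, index)                             # last real output per sequence

with tf.Session() as sess:
    print(sess.run(relevant))
    # [[ 3.  4.  5.]
    #  [12. 13. 14.]]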
Next we add gradient clipping, which limits gradient values to a sensible range. Any cost function that makes sense for classification can be used, since the model output is a probability distribution over all classes; adding gradient clipping improves learning and limits the maximum weight update. RNNs are difficult to train, and with poorly chosen hyper-parameters the weights diverge very easily.
TensorFlow supports this through the optimizer instance's compute_gradients function: we modify the gradients and then apply the weight changes with apply_gradients. A gradient component below -limit is set to -limit; a component above limit is set to limit. A TensorFlow derivative can also be None, meaning the variable has no relation to the cost function; mathematically it should be a zero vector, but returning None allows internal performance optimizations, so we simply pass the None value through.
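A minimal sketch of this clipping pattern, assuming a toy cost and a clipping limit of 5.0 (in the model below the limit comes from params.gradient_clipping):

import tensorflow as tf

# Toy variable and cost so the snippet runs on its own.
weight = tf.Variable(2.0)
cost = tf.square(weight * 3.0 - 1.0)

optimizer = tf.train.RMSPropOptimizer(0.002)
limit = 5.0  # assumed clipping threshold

gradients = optimizer.compute_gradients(cost)
clipped = [(tf.clip_by_value(g, -limit, limit), v) if g is not None else (None, v)
           for g, v in gradients]
train_op = optimizer.apply_gradients(clipped)

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    sess.run(train_op)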
The movie reviews are fed into the recurrent network word by word, so each time step consists of a batch of word vectors. The batched preprocessing function looks up the word vectors and pads all sequences to the same length. To train the model we define the hyper-parameters, load the dataset and the word vectors, and run the model on the preprocessed training batches. Successful training depends on the network structure, the hyper-parameters, and the quality of the word vectors. Pre-trained word vectors can be loaded from the skip-gram model of the word2vec project or from the Stanford NLP Group's GloVe model (https://nlp.stanford.edu/projects/glove).
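The preprocess_batched helper is imported but not shown in the listing below. A plausible minimal sketch of it (an assumption, not necessarily the book's exact implementation) embeds and pads each review with the Embedding helper and groups the results into fixed-size batches:

import numpy as np

def preprocess_batched(reviews, length, embedding, batch_size):
    # Yields (data, target) batches: data holds padded word vectors,
    # target holds one-hot sentiment labels.
    iterator = iter(reviews)
    while True:
        data = np.zeros((batch_size, length, embedding.dimensions))
        target = np.zeros((batch_size, 2))
        for index in range(batch_size):
            try:
                text, label = next(iterator)
            except StopIteration:
                return
            data[index] = embedding(text)
            target[index] = [1, 0] if label else [0, 1]
        yield data, target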
Kaggle runs an open learning competition on the IMDB movie review data, so you can compare your predictions with those of others.
# Reader for the IMDB review archive.
import tarfile
import re

from helpers import download


class ImdbMovieReviews:

    DEFAULT_URL = \
        'http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz'
    TOKEN_REGEX = re.compile(r'[A-Za-z]+|[!?.:,()]')

    def __init__(self, cache_dir, url=None):
        self._cache_dir = cache_dir
        self._url = url or type(self).DEFAULT_URL

    def __iter__(self):
        filepath = download(self._url, self._cache_dir)
        with tarfile.open(filepath) as archive:
            for filename in archive.getnames():
                if filename.startswith('aclImdb/train/pos/'):
                    yield self._read(archive, filename), True
                elif filename.startswith('aclImdb/train/neg/'):
                    yield self._read(archive, filename), False

    def _read(self, archive, filename):
        with archive.extractfile(filename) as file_:
            data = file_.read().decode('utf-8')
            data = type(self).TOKEN_REGEX.findall(data)
            data = [x.lower() for x in data]
            return data


# Lookup from tokens to pre-trained word vectors, with zero padding.
import bz2
import numpy as np


class Embedding:

    def __init__(self, vocabulary_path, embedding_path, length):
        self._embedding = np.load(embedding_path)
        with bz2.open(vocabulary_path, 'rt') as file_:
            self._vocabulary = {k.strip(): i for i, k in enumerate(file_)}
        self._length = length

    def __call__(self, sequence):
        data = np.zeros((self._length, self._embedding.shape[1]))
        indices = [self._vocabulary.get(x, 0) for x in sequence]
        embedded = self._embedding[indices]
        data[:len(sequence)] = embedded
        return data

    @property
    def dimensions(self):
        return self._embedding.shape[1]


# The sequence classification model.
import tensorflow as tf

from helpers import lazy_property


class SequenceClassificationModel:

    def __init__(self, data, target, params):
        self.data = data
        self.target = target
        self.params = params
        self.prediction
        self.cost
        self.error
        self.optimize

    @lazy_property
    def length(self):
        # Number of non-padding time steps per sequence.
        used = tf.sign(tf.reduce_max(tf.abs(self.data), reduction_indices=2))
        length = tf.reduce_sum(used, reduction_indices=1)
        length = tf.cast(length, tf.int32)
        return length

    @lazy_property
    def prediction(self):
        # Recurrent network.
        output, _ = tf.nn.dynamic_rnn(
            self.params.rnn_cell(self.params.rnn_hidden),
            self.data,
            dtype=tf.float32,
            sequence_length=self.length,
        )
        last = self._last_relevant(output, self.length)
        # Softmax layer.
        num_classes = int(self.target.get_shape()[1])
        weight = tf.Variable(tf.truncated_normal(
            [self.params.rnn_hidden, num_classes], stddev=0.01))
        bias = tf.Variable(tf.constant(0.1, shape=[num_classes]))
        prediction = tf.nn.softmax(tf.matmul(last, weight) + bias)
        return prediction

    @lazy_property
    def cost(self):
        cross_entropy = -tf.reduce_sum(self.target * tf.log(self.prediction))
        return cross_entropy

    @lazy_property
    def error(self):
        mistakes = tf.not_equal(
            tf.argmax(self.target, 1), tf.argmax(self.prediction, 1))
        return tf.reduce_mean(tf.cast(mistakes, tf.float32))

    @lazy_property
    def optimize(self):
        gradient = self.params.optimizer.compute_gradients(self.cost)
        try:
            limit = self.params.gradient_clipping
            gradient = [
                (tf.clip_by_value(g, -limit, limit), v)
                if g is not None else (None, v)
                for g, v in gradient]
        except AttributeError:
            print('No gradient clipping parameter specified.')
        optimize = self.params.optimizer.apply_gradients(gradient)
        return optimize

    @staticmethod
    def _last_relevant(output, length):
        # Select the output at the last real time step of every sequence.
        batch_size = tf.shape(output)[0]
        max_length = int(output.get_shape()[1])
        output_size = int(output.get_shape()[2])
        index = tf.range(0, batch_size) * max_length + (length - 1)
        flat = tf.reshape(output, [-1, output_size])
        relevant = tf.gather(flat, index)
        return relevant


# Training script.
import tensorflow as tf

from helpers import AttrDict
from Embedding import Embedding
from ImdbMovieReviews import ImdbMovieReviews
from preprocess_batched import preprocess_batched
from SequenceClassificationModel import SequenceClassificationModel

IMDB_DOWNLOAD_DIR = './imdb'
WIKI_VOCAB_DIR = '../01_wikipedia/wikipedia'
WIKI_EMBED_DIR = '../01_wikipedia/wikipedia'

params = AttrDict(
    rnn_cell=tf.contrib.rnn.GRUCell,
    rnn_hidden=300,
    optimizer=tf.train.RMSPropOptimizer(0.002),
    batch_size=20,
)

reviews = ImdbMovieReviews(IMDB_DOWNLOAD_DIR)
length = max(len(x[0]) for x in reviews)

embedding = Embedding(
    WIKI_VOCAB_DIR + '/vocabulary.bz2',
    WIKI_EMBED_DIR + '/embeddings.npy', length)
batches = preprocess_batched(reviews, length, embedding, params.batch_size)

data = tf.placeholder(tf.float32, [None, length, embedding.dimensions])
target = tf.placeholder(tf.float32, [None, 2])
model = SequenceClassificationModel(data, target, params)

sess = tf.Session()
sess.run(tf.initialize_all_variables())
for index, batch in enumerate(batches):
    feed = {data: batch[0], target: batch[1]}
    error, _ = sess.run([model.error, model.optimize], feed)
    print('{}: {:3.1f}%'.format(index + 1, 100 * error))
Resources:
"TensorFlow Practice for Machine Intelligence"
Feel free to contact me to discuss: Qingxingfengzi
My WeChat public account: Qingxingfengzigz
My wife Zhang Yuqing's WeChat public account: Qingqingfeifangz