Implementing an AutoEncoder in TensorFlow
I. Overview
An AutoEncoder is a learning method that compresses the high-dimensional features of the data down to a lower dimension and then runs the opposite, decoding process. During learning, the decoded result is compared with the original data, and the loss function is reduced by adjusting the weight and bias parameters, which continuously improves the network's ability to restore the original input. Once learning is complete, the encoding half of the network yields a low-dimensional "feature value" representation of the original data. The trained autoencoder can therefore compress high-dimensional data to the desired dimension; the principle is similar to that of PCA.
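As a minimal, framework-free sketch of this idea (the toy data, array sizes, and variable names below are illustrative only and are not taken from the code in this article): the encoder compresses the input, the decoder reconstructs it, and the mean squared reconstruction error drives the parameter updates.

import numpy as np

# Toy data: 100 samples with 20 features each (illustrative only)
rng = np.random.RandomState(0)
x = rng.rand(100, 20)

# A linear encoder (20 -> 5) and decoder (5 -> 20)
w_enc = rng.randn(20, 5) * 0.1
w_dec = rng.randn(5, 20) * 0.1

lr = 0.1
for step in range(1000):
    code = x @ w_enc           # encode: compress to 5 dimensions
    recon = code @ w_dec       # decode: reconstruct the 20 dimensions
    err = recon - x
    loss = np.mean(err ** 2)   # reconstruction (least-squares) loss
    # Gradients of the mean squared error w.r.t. both weight matrices
    grad_dec = code.T @ err * (2.0 / err.size)
    grad_enc = x.T @ (err @ w_dec.T) * (2.0 / err.size)
    w_dec -= lr * grad_dec
    w_enc -= lr * grad_enc

print("final reconstruction loss:", loss)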
II. Model Implementation
1. AutoEncoder
We first use the MNIST dataset: its features are compressed and then decompressed, and the reconstructed data is compared visually with the original data.
Here is the code:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=False)

learning_rate = 0.01
training_epochs = 10
batch_size = 256
display_step = 1
examples_to_show = 10
n_input = 784

# tf Graph input (only pictures)
X = tf.placeholder("float", [None, n_input])

# Dictionary-based storage of the parameters
n_hidden_1 = 256  # number of neurons in the first encoding layer
n_hidden_2 = 128  # number of neurons in the second encoding layer

# The weights and biases of the encoding and decoding layers mirror each other.
# Each weight matrix has shape [layer input, layer output]; each bias vector
# matches the number of units in that layer's output.
weights = {
    'encoder_h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'encoder_h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'decoder_h1': tf.Variable(tf.random_normal([n_hidden_2, n_hidden_1])),
    'decoder_h2': tf.Variable(tf.random_normal([n_hidden_1, n_input])),
}
biases = {
    'encoder_b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'encoder_b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'decoder_b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'decoder_b2': tf.Variable(tf.random_normal([n_input])),
}

# Each layer has the structure sigmoid(xW + b)
# Build the encoder
def encoder(x):
    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, weights['encoder_h1']), biases['encoder_b1']))
    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, weights['encoder_h2']), biases['encoder_b2']))
    return layer_2

# Build the decoder
def decoder(x):
    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, weights['decoder_h1']), biases['decoder_b1']))
    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, weights['decoder_h2']), biases['decoder_b2']))
    return layer_2

# Build the model
encoder_op = encoder(X)
decoder_op = decoder(encoder_op)

# Prediction
y_pred = decoder_op
y_true = X

# Define the cost function (least squares / mean squared error) and the optimizer
cost = tf.reduce_mean(tf.pow(y_true - y_pred, 2))
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)

with tf.Session() as sess:
    # tf.initialize_all_variables() is no longer valid from
    # 2017-03-02 if using tensorflow >= 0.12
    if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
        init = tf.initialize_all_variables()
    else:
        init = tf.global_variables_initializer()
    sess.run(init)
    # Total number of batches, so that every sample of the training set
    # takes part in every training epoch
    total_batch = int(mnist.train.num_examples / batch_size)
    for epoch in range(training_epochs):
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)  # max(x) = 1, min(x) = 0
            # Run optimization op (backprop) and cost op (to get loss value)
            _, c = sess.run([optimizer, cost], feed_dict={X: batch_xs})
        if epoch % display_step == 0:
            print("Epoch:", '%04d' % (epoch + 1), "cost=", "{:.9f}".format(c))
    print("Optimization Finished!")

    # Compare the reconstructions with the original test images
    encode_decode = sess.run(y_pred, feed_dict={X: mnist.test.images[:examples_to_show]})
    f, a = plt.subplots(2, 10, figsize=(10, 2))
    for i in range(examples_to_show):
        a[0][i].imshow(np.reshape(mnist.test.images[i], (28, 28)))
        a[1][i].imshow(np.reshape(encode_decode[i], (28, 28)))
    plt.show()
Code explanation:
First, import the required libraries and the dataset, and define parameters such as the learning rate and the number of training epochs so they are easy to modify later. Because the network structure of the autoencoder is very regular and every layer has the form xW + b, the weight W and bias b variables of each layer are created with tf.Variable and stored together in dictionaries, whose keys describe each parameter clearly. The model is built in two parts, the encoder and the decoder, and each layer uses the sigmoid activation function. The decoder usually uses the same activation function as the encoder, and in general the decoder is the inverse of the encoder: here, for example, the encoder reduces 784 dimensions to 256 and then to 128, so the decoder decodes from 128 dimensions back to 256 and then to 784. The cost function is defined as the least-squares (mean squared) error between the decoder's output and the original input, and the AdamOptimizer is used for training; every epoch processes all of the training data. After training, the reconstructions are compared visually with the original data. As the figure shows, the reconstruction quality is already fairly high; increasing the number of training epochs or the number of layers in the autoencoder would give an even better reconstruction.
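Since every layer has the same xW + b form, the repeated layer expressions could, if desired, be factored into a small helper. This is only an optional sketch (the fc_layer name is hypothetical and not part of the code above):

import tensorflow as tf

def fc_layer(x, w, b, activation=tf.nn.sigmoid):
    # One fully connected layer: activation(xW + b); pass activation=None for a linear layer
    z = tf.add(tf.matmul(x, w), b)
    return z if activation is None else activation(z)

# With the weights/biases dictionaries defined above, the encoder could then read:
# layer_1 = fc_layer(X, weights['encoder_h1'], biases['encoder_b1'])
# layer_2 = fc_layer(layer_1, weights['encoder_h2'], biases['encoder_b2'])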
Running result:
2. Encoder
The encoder here works in the same way as in the AutoEncoder above, but now we visualize the low-dimensional feature values produced by the encoder in a low-dimensional space, to show the clustering effect on the data directly. Specifically, the 784-dimensional MNIST data is reduced step by step from 784 to 128 to 64 to 10 and finally to 2 dimensions, and the 2-dimensional codes are plotted in a coordinate system. In the last layer of the encoder we do not apply the sigmoid activation function; the default linear activation is used instead, so that the output can range over (-∞, +∞).
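One way to see this layer chain at a glance is to build the encoder from a list of layer sizes in a loop, keeping the final layer linear. The sketch below is a hypothetical reformulation (the dims list, build_encoder, and the variable names are illustrative, not the code used in this article); the full original-style script follows.

import tensorflow as tf

dims = [784, 128, 64, 10, 2]  # 784 -> 128 -> 64 -> 10 -> 2

def build_encoder(x, dims):
    out = x
    for i in range(len(dims) - 1):
        w = tf.Variable(tf.truncated_normal([dims[i], dims[i + 1]]))
        b = tf.Variable(tf.random_normal([dims[i + 1]]))
        z = tf.add(tf.matmul(out, w), b)
        # Keep the last encoder layer linear so the 2-D code can take any real value
        out = z if i == len(dims) - 2 else tf.nn.sigmoid(z)
    return out

X = tf.placeholder("float", [None, dims[0]])
code = build_encoder(X, dims)  # a [None, 2] tensor of 2-D feature codes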
Complete code:
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=False)

learning_rate = 0.01
training_epochs = 10
batch_size = 256
display_step = 1
n_input = 784

X = tf.placeholder("float", [None, n_input])

n_hidden_1 = 128
n_hidden_2 = 64
n_hidden_3 = 10
n_hidden_4 = 2

weights = {
    'encoder_h1': tf.Variable(tf.truncated_normal([n_input, n_hidden_1])),
    'encoder_h2': tf.Variable(tf.truncated_normal([n_hidden_1, n_hidden_2])),
    'encoder_h3': tf.Variable(tf.truncated_normal([n_hidden_2, n_hidden_3])),
    'encoder_h4': tf.Variable(tf.truncated_normal([n_hidden_3, n_hidden_4])),
    'decoder_h1': tf.Variable(tf.truncated_normal([n_hidden_4, n_hidden_3])),
    'decoder_h2': tf.Variable(tf.truncated_normal([n_hidden_3, n_hidden_2])),
    'decoder_h3': tf.Variable(tf.truncated_normal([n_hidden_2, n_hidden_1])),
    'decoder_h4': tf.Variable(tf.truncated_normal([n_hidden_1, n_input])),
}
biases = {
    'encoder_b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'encoder_b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'encoder_b3': tf.Variable(tf.random_normal([n_hidden_3])),
    'encoder_b4': tf.Variable(tf.random_normal([n_hidden_4])),
    'decoder_b1': tf.Variable(tf.random_normal([n_hidden_3])),
    'decoder_b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'decoder_b3': tf.Variable(tf.random_normal([n_hidden_1])),
    'decoder_b4': tf.Variable(tf.random_normal([n_input])),
}

def encoder(x):
    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, weights['encoder_h1']), biases['encoder_b1']))
    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, weights['encoder_h2']), biases['encoder_b2']))
    layer_3 = tf.nn.sigmoid(tf.add(tf.matmul(layer_2, weights['encoder_h3']), biases['encoder_b3']))
    # To make the coding layer easy to visualize, the last encoding layer
    # uses no activation function (i.e. it is linear)
    layer_4 = tf.add(tf.matmul(layer_3, weights['encoder_h4']), biases['encoder_b4'])
    return layer_4

def decoder(x):
    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, weights['decoder_h1']), biases['decoder_b1']))
    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, weights['decoder_h2']), biases['decoder_b2']))
    layer_3 = tf.nn.sigmoid(tf.add(tf.matmul(layer_2, weights['decoder_h3']), biases['decoder_b3']))
    layer_4 = tf.nn.sigmoid(tf.add(tf.matmul(layer_3, weights['decoder_h4']), biases['decoder_b4']))
    return layer_4

encoder_op = encoder(X)
decoder_op = decoder(encoder_op)

y_pred = decoder_op
y_true = X

cost = tf.reduce_mean(tf.pow(y_true - y_pred, 2))
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)

with tf.Session() as sess:
    # tf.initialize_all_variables() is no longer valid from
    # 2017-03-02 if using tensorflow >= 0.12
    if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
        init = tf.initialize_all_variables()
    else:
        init = tf.global_variables_initializer()
    sess.run(init)
    total_batch = int(mnist.train.num_examples / batch_size)
    for epoch in range(training_epochs):
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)  # max(x) = 1, min(x) = 0
            _, c = sess.run([optimizer, cost], feed_dict={X: batch_xs})
        if epoch % display_step == 0:
            print("Epoch:", '%04d' % (epoch + 1), "cost=", "{:.9f}".format(c))
    print("Optimization Finished!")

    # Visualize the 2-D codes of the test set, colored by digit label
    encoder_result = sess.run(encoder_op, feed_dict={X: mnist.test.images})
    plt.scatter(encoder_result[:, 0], encoder_result[:, 1], c=mnist.test.labels)
    plt.colorbar()
    plt.show()
Experiment results:
The results show that the two-dimensional encoded features cluster well: each color in the figure represents one digit, and points belonging to the same digit group together clearly.
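If a rough quantitative check of this clustering is wanted (an optional addition, not part of the original experiment; it assumes scikit-learn is installed and that encoder_result and mnist.test.labels from the script above are still in scope), the silhouette score of the 2-D codes can be computed:

from sklearn.metrics import silhouette_score

# Scores closer to 1 mean better-separated digit clusters; a subsample keeps the
# pairwise-distance computation cheap.
score = silhouette_score(encoder_result, mnist.test.labels,
                         sample_size=2000, random_state=0)
print("silhouette score of the 2-D codes:", score)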
Of course, this experiment is only a brief introduction to the AutoEncoder. To obtain better results, a more elaborate autoencoder structure should be designed so as to extract more discriminative features.
That is all for this article. I hope it is helpful for your learning.