Placeholder data type:
x = tf.placeholder(tf.float32, shape=[None, 784])
The shape describes the size of the placeholder: any number of rows (None), but exactly 784 columns.
The placeholder adapts to the size of the data that is fed into it.
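A minimal sketch of how a placeholder receives its value at run time (the session and the dummy data here are illustrative, not part of the tutorial):
import numpy as np
import tensorflow as tf
x = tf.placeholder(tf.float32, shape=[None, 784])
doubled = x * 2.0
with tf.Session() as sess:
    # Feed a batch of 5 dummy rows; the None dimension accepts any batch size.
    out = sess.run(doubled, feed_dict={x: np.zeros((5, 784), dtype=np.float32)})
    print(out.shape)  # (5, 784)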
Variable data type:
W = tf.Variable(tf.zeros([784, 10]))
A Variable must be initialized manually before it can be used. Its initialization op is tf.global_variables_initializer(), and that op itself has to be run, wrapped in sess.run(): sess.run(tf.global_variables_initializer())
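For reference, a minimal sketch of the initialization step inside a plain tf.Session (the linked tutorial itself uses tf.InteractiveSession):
import tensorflow as tf
W = tf.Variable(tf.zeros([784, 10]))
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # must run before W is read
    print(sess.run(W).shape)  # (784, 10), all zeros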
To define the computation graph:
Write the equation of the computation and plug in the data matrices:
y = tf.matmul(x, W) + b
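Here b is the bias; in the same tutorial it is a zero-initialized Variable:
b = tf.Variable(tf.zeros([10]))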
Define the loss:
loss = tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y)
Define the optimization objective: the loss averaged over the samples:
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
Set up training:
To optimize the objective, specify the optimization direction:
minimize(cross_entropy)
Select the optimization algorithm from the tf.train module (e.g. gradient descent; the step size is set by the parameter):
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
Running training:
train_step.run()
This runs a single training step.
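Calling .run() directly on an op like this relies on a default session; the tutorial creates one up front with tf.InteractiveSession() (the placeholders still need a feed_dict, as shown below):
sess = tf.InteractiveSession()               # becomes the default session
sess.run(tf.global_variables_initializer())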
To train over many batches, use a loop; the data is split into batches (batch), and a batching method is built in:
batch = mnist.train.next_batch(100)
When running, the batch must be fed in as input, like this:
for _ in range(1000):  # 1000 steps is the value used in the linked tutorial
    batch = mnist.train.next_batch(100)
    train_step.run(feed_dict={x: batch[0], y_: batch[1]})
Evaluate the model:
The argmax function
tf.argmax(y, 1) gives the index of the most probable class in each row of y (the predicted label).
Then use tf.equal() to check whether prediction and label are equal:
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
This returns a list of booleans.
Convert them to floating-point numbers (tf.cast) and take the mean (tf.reduce_mean):
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
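The accuracy can then be evaluated on the test set, as in the tutorial (for the dropout model further down, keep_prob: 1.0 must also be fed):
print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))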
The basic model is complete. Code download address:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/mnist/mnist_softmax.py
Next, move on to a more powerful model:
Initialization of weights:
For weights, use tf.truncated_normal:
It generates a tensor of the given shape drawn from a truncated normal distribution (a small amount of noise, which with ReLU units also helps avoid dead neurons)
and wraps it in a Variable:
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)
For biases, generate a small constant of the given shape:
def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)
Convolution and pooling layers:
The tf.nn module provides convolution and pooling ops:
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')
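As a quick sanity check on shapes (a hedged sketch, not part of the tutorial): with padding='SAME' and stride 1 the convolution keeps the spatial size, and each max_pool_2x2 halves it.
img = tf.zeros([1, 28, 28, 1])             # dummy batch of one 28x28 grayscale image
k = tf.zeros([5, 5, 1, 32])                # dummy 5x5 kernel, 1 -> 32 channels
print(conv2d(img, k).shape)                # (1, 28, 28, 32)
print(max_pool_2x2(conv2d(img, k)).shape)  # (1, 14, 14, 32)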
Implement the first convolutional layer:
Define and initialize the convolution weights:
The first three dimensions describe the input side:
5, 5 is the patch size (image block size), 1 is the number of input (color) channels, and 32 is the number of output features (neurons).
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
Reshape the input image:
x_image = tf.reshape(x, [-1, 28, 28, 1])
Convolve, apply ReLU, then use the max_pool_2x2 method for 2x2 max pooling:
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)
Second layer:
Now each 5x5 image patch produces 64 features:
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)
Fully connected layer:
1024 neurons. After two rounds of 2x2 pooling the 28x28 image has been reduced to 7x7 with 64 feature maps, so the flattened input has 7*7*64 values:
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
Dropout:
The tf.nn module has a dropout op that wraps a layer's output directly:
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
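Because keep_prob is itself a placeholder, the same graph can drop activations during training and keep them all at evaluation time; the 0.5 / 1.0 values below follow the tutorial:
train_feed = {x: batch[0], y_: batch[1], keep_prob: 0.5}                    # training: drop half
test_feed = {x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}   # evaluation: keep all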
Output layer:
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
The first dimension of W is the size of the previous layer, the second is the size of this layer.
Do some optimizations:
Replace gradient descent (GD) with the Adam optimizer (for this model, cross_entropy is computed from y_conv instead of y):
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
Add dropout's keep_prob to the feed_dict:
feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5}
Output results every 100 steps:
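A sketch of the full training loop in the spirit of the tutorial (assuming correct_prediction and accuracy have been rebuilt from y_conv; the 20000 steps and batch size of 50 are the tutorial's values):
for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        # Evaluate on the current batch without dropout.
        train_accuracy = accuracy.eval(feed_dict={
            x: batch[0], y_: batch[1], keep_prob: 1.0})
        print('step %d, training accuracy %g' % (i, train_accuracy))
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
print('test accuracy %g' % accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))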