First, here is the main Python (2.7) code, which can be found in the Deep Learning Tutorials.
 1 # allocate symbolic variables for the data
 2 index = T.lscalar()  # index to a [mini]batch
 3 x = T.matrix('x')    # the data is presented as rasterized images
 4 y = T.ivector('y')   # the labels are presented as 1D vector of
 5                      # [int] labels
 6
 7 # construct the logistic regression class
 8 # each MNIST image has size 28*28
 9 classifier = LogisticRegression(input=x, n_in=28 * 28, n_out=10)
10
11 # the cost we minimize during training is the negative log likelihood of
12 # the model in symbolic format
13 cost = classifier.negative_log_likelihood(y)
14
15 # compiling a Theano function that computes the mistakes that are made by
16 # the model on a minibatch
17 test_model = theano.function(inputs=[index],
18         outputs=classifier.errors(y),
19         givens={
20             x: test_set_x[index * batch_size: (index + 1) * batch_size],
21             y: test_set_y[index * batch_size: (index + 1) * batch_size]})
22
23 validate_model = theano.function(inputs=[index],
24         outputs=classifier.errors(y),
25         givens={
26             x: valid_set_x[index * batch_size: (index + 1) * batch_size],
27             y: valid_set_y[index * batch_size: (index + 1) * batch_size]})
28
29 # compute the gradient of cost with respect to theta = (W, b)
30 g_W = T.grad(cost=cost, wrt=classifier.W)
31 g_b = T.grad(cost=cost, wrt=classifier.b)
32
33 # specify how to update the parameters of the model as a list of
34 # (variable, update expression) pairs.
35 updates = [(classifier.W, classifier.W - learning_rate * g_W),
36            (classifier.b, classifier.b - learning_rate * g_b)]
37
38 # compiling a Theano function `train_model` that returns the cost, but at
39 # the same time updates the parameters of the model based on the rules
40 # defined in `updates`
41 train_model = theano.function(inputs=[index],
42         outputs=cost,
43         updates=updates,
44         givens={
45             x: train_set_x[index * batch_size: (index + 1) * batch_size],
46             y: train_set_y[index * batch_size: (index + 1) * batch_size]})
The code is not long, but its logic needs some untangling, so it is walked through a few lines at a time below.
In the code, T is an alias for theano.tensor (i.e. import theano.tensor as T).
Lines 1~13:
# allocate symbolic variables for the data
index = T.lscalar()  # index to a [mini]batch
x = T.matrix('x')    # the data is presented as rasterized images
y = T.ivector('y')   # the labels are presented as 1D vector of
                     # [int] labels

# construct the logistic regression class
# each MNIST image has size 28*28
classifier = LogisticRegression(input=x, n_in=28 * 28, n_out=10)

# the cost we minimize during training is the negative log likelihood of
# the model in symbolic format
cost = classifier.negative_log_likelihood(y)
Three symbolic variables are declared: index, x, and y (similar to symbolic variables in MATLAB). They refer, respectively, to the mini-batch index into the training samples, the input image matrix, and the expected output vector.
classifier is a LogisticRegression object, built by calling the class constructor with the symbolic variable x as input. Later we can use theano.function to tie x and classifier together, so that the classifier's output changes as x changes.
cost is the negative log-likelihood of the classifier, taking the symbolic variable y as input; its role is symbolic in the same way as classifier, so the point is not repeated.
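For reference, the negative_log_likelihood method inside the tutorial's LogisticRegression class is roughly the following (a sketch from memory of the Deep Learning Tutorials, assuming self.p_y_given_x holds the model's softmax output):

# sketch of LogisticRegression.negative_log_likelihood (Deep Learning Tutorials)
def negative_log_likelihood(self, y):
    # self.p_y_given_x is the (batch_size, n_out) matrix of softmax probabilities;
    # pick row i's log-probability of the correct label y[i] and average the negatives
    return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])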
Lines 14~28:
# compiling a Theano function that computes the mistakes that are made by
# the model on a minibatch
test_model = theano.function(inputs=[index],
        outputs=classifier.errors(y),
        givens={
            x: test_set_x[index * batch_size: (index + 1) * batch_size],
            y: test_set_y[index * batch_size: (index + 1) * batch_size]})

validate_model = theano.function(inputs=[index],
        outputs=classifier.errors(y),
        givens={
            x: valid_set_x[index * batch_size: (index + 1) * batch_size],
            y: valid_set_y[index * batch_size: (index + 1) * batch_size]})
These two models are a confusing spot; understanding them requires some basic knowledge of theano.function:
For example, declare two symbolic variables a and b with a, b = T.iscalar(), T.iscalar(); both are integer (i) scalars. Then declare a variable c: c = a + b. We can inspect its type with type(c):
>>> type(c)
<class 'theano.tensor.var.TensorVariable'>
>>> type(a)
<class 'theano.tensor.var.TensorVariable'>
c has the same type as a and b: all are tensor variables. With the preparation complete, we build the relationship through theano.function: add = theano.function(inputs=[a, b], outputs=c). This statement compiles a function add that takes a and b as inputs and outputs c. We can use it in Python like this:
>>> add = theano.function(inputs=[a, b], outputs=c)
>>> test = add(100, 100)
>>> test
array(200)
With these basics, the meaning of the two models is straightforward:
test_model = theano.function(inputs=[index],
        outputs=classifier.errors(y),
        givens={
            x: test_set_x[index * batch_size: (index + 1) * batch_size],
            y: test_set_y[index * batch_size: (index + 1) * batch_size]})
Its input is index, and its output is the return value of the errors method of the classifier object, with y as the argument to errors; classifier itself takes x as its input.
The givens keyword replaces each variable before a colon with the expression after it; here it replaces x and y with the index-th batch (of size batch_size) of the test data.
In plain words, test_model is a function that takes a batch index, feeds the corresponding test images x and expected outputs y to the model, and returns the error rate.
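The errors method used here is, in the tutorial's LogisticRegression class, essentially the mean misclassification rate over the minibatch (a sketch, assuming self.y_pred holds the argmax predictions):

# sketch of LogisticRegression.errors (Deep Learning Tutorials)
def errors(self, y):
    # fraction of the minibatch where the predicted label differs from the true label
    return T.mean(T.neq(self.y_pred, y))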
validate_model = theano.function(inputs=[index],
        outputs=classifier.errors(y),
        givens={
            x: valid_set_x[index * batch_size: (index + 1) * batch_size],
            y: valid_set_y[index * batch_size: (index + 1) * batch_size]})
This is the same as above, except that the validation data is used.
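To make the givens substitution concrete, here is a minimal, self-contained sketch (the names x, shift, and f are purely illustrative):

import theano
import theano.tensor as T

x = T.iscalar('x')
shift = T.iscalar('shift')
y = x * 2

# at call time, every occurrence of x in the graph is replaced by shift + 1
f = theano.function(inputs=[shift], outputs=y, givens={x: shift + 1})
print(f(4))  # prints 10, i.e. (4 + 1) * 2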
Lines 29~32:
# compute the gradient of cost with respect to theta = (W, b)
g_W = T.grad(cost=cost, wrt=classifier.W)
g_b = T.grad(cost=cost, wrt=classifier.b)
The gradients are computed for the learning algorithm; T.grad(y, x) computes the gradient of y with respect to x.
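A minimal sketch of T.grad in isolation (not part of the original code):

import theano
import theano.tensor as T

x = T.dscalar('x')
y = x ** 2
gy = T.grad(y, x)             # symbolic expression for dy/dx = 2x

f = theano.function([x], gy)  # compile the gradient into a callable
print(f(3.0))                 # prints 6.0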
Lines 33~37:
# specify how to update the parameters of the model as a list of
# (variable, update expression) pairs.
updates = [(classifier.W, classifier.W - learning_rate * g_W),
           (classifier.b, classifier.b - learning_rate * g_b)]
updates is a list of length 2, each element a (variable, update expression) tuple. When passed to theano.function, every call to the compiled function replaces the value of the first element of each tuple with the value of the second; here that performs one gradient-descent step on W and b.
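The update mechanism is easiest to see in isolation with the classic accumulator sketch (state, inc, and accumulate are illustrative names):

import theano
import theano.tensor as T

state = theano.shared(0, name='state')  # a shared variable the function mutates
inc = T.iscalar('inc')

# each call returns the old state, then replaces state with state + inc
accumulate = theano.function(inputs=[inc], outputs=state,
                             updates=[(state, state + inc)])

print(accumulate(1))      # 0 (value before the update)
print(state.get_value())  # 1 (state was updated in place)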
Lines 38~46:
# compiling a Theano function `train_model` that returns the cost, but at
# the same time updates the parameters of the model based on the rules
# defined in `updates`
train_model = theano.function(inputs=[index],
        outputs=cost,
        updates=updates,
        givens={
            x: train_set_x[index * batch_size: (index + 1) * batch_size],
            y: train_set_y[index * batch_size: (index + 1) * batch_size]})
The rest mirrors the earlier models and is not discussed again. The important point is the added updates argument, which modifies the parameters (W, b) on every call to train_model. The output also becomes the cost (negative log-likelihood) rather than the errors function (misclassification rate) used in test_model and validate_model.
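For context, the three compiled functions are used in a training loop along the lines of the following sketch (n_epochs, n_train_batches, and n_valid_batches are assumed to be defined as in the tutorial):

import numpy

for epoch in range(n_epochs):
    for minibatch_index in range(n_train_batches):
        # one gradient-descent step on one minibatch; also returns its cost
        minibatch_cost = train_model(minibatch_index)
    # mean zero-one error over the whole validation set
    validation_losses = [validate_model(i) for i in range(n_valid_batches)]
    print('epoch %d, validation error %f %%' %
          (epoch, numpy.mean(validation_losses) * 100.))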