A Python program that recognizes handwritten digits from the MNIST data set

Source: Internet
Author: User

The first thing we need is to get the MNIST data. If you're a git user, you can obtain the data by cloning the code repository for this book; we'll then implement a network to classify the digits:

git clone https://github.com/mnielsen/neural-networks-and-deep-learning.git

class Network(object):

    def __init__(self, sizes):
        self.num_layers = len(sizes)
        self.sizes = sizes
        self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
        self.weights = [np.random.randn(y, x)
                        for x, y in zip(sizes[:-1], sizes[1:])]

In this code, the list sizes contains the number of neurons in the respective layers. For example, if we want to create a Network object with 2 neurons in the first layer, 3 neurons in the second layer, and 1 neuron in the last layer, we'd write:

net = Network([2, 3, 1])
The biases and weights in the Network object are initialized randomly, using NumPy's np.random.randn function to generate Gaussian distributions with mean 0 and standard deviation 1. This random initialization gives our stochastic gradient descent algorithm a place to start from. In later chapters we'll find better ways of initializing the weights and biases, but random initialization will do for now. Note that the Network initialization code assumes the first layer of neurons is an input layer, and doesn't set any biases for those neurons, since biases are only ever used in computing the outputs from later layers.
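To see concretely what this initialization produces, here is a small standalone sketch (outside the class) showing the shapes of the bias vectors and weight matrices for a [2, 3, 1] network:

```python
import numpy as np

# Same initialization as Network.__init__, for sizes = [2, 3, 1]
sizes = [2, 3, 1]
biases = [np.random.randn(y, 1) for y in sizes[1:]]
weights = [np.random.randn(y, x)
           for x, y in zip(sizes[:-1], sizes[1:])]

# One bias vector and one weight matrix per non-input layer
print([b.shape for b in biases])   # [(3, 1), (1, 1)]
print([w.shape for w in weights])  # [(3, 2), (1, 3)]
```

Note that weights[0] is a 3x2 matrix: row y, column x holds the weight connecting input neuron x to hidden neuron y, so np.dot(w, a) works directly on a column-vector input.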


With this, it's easy to write code computing the output from a Network instance. We begin by defining the sigmoid function:

def sigmoid(z):
    return 1.0/(1.0+np.exp(-z))
Note that when the input z is a vector or a NumPy array, NumPy automatically applies the sigmoid function elementwise, that is, in vectorized form.
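As a quick check of this vectorized behaviour, the following sketch applies sigmoid both to a scalar and to a NumPy array:

```python
import numpy as np

def sigmoid(z):
    return 1.0/(1.0+np.exp(-z))

print(sigmoid(0.0))  # 0.5

# Applied to an array, NumPy evaluates the function elementwise,
# so no explicit loop is needed
print(sigmoid(np.array([-1.0, 0.0, 1.0])))
```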
We then add a feedforward method to the Network class which, given an input a for the network, returns the corresponding output. All this method does is apply Equation (22) for each layer:

def feedforward(self, a):
    """Return the output of the network if "a" is input."""
    for b, w in zip(self.biases, self.weights):
        a = sigmoid(np.dot(w, a)+b)
    return a
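The same loop can be run outside the class. This minimal sketch (with a hypothetical fixed random seed and randomly initialized parameters for a [2, 3, 1] network) propagates an input column vector through each layer:

```python
import numpy as np

def sigmoid(z):
    return 1.0/(1.0+np.exp(-z))

rng = np.random.default_rng(0)  # hypothetical seed, for repeatability
sizes = [2, 3, 1]
biases = [rng.standard_normal((y, 1)) for y in sizes[1:]]
weights = [rng.standard_normal((y, x))
           for x, y in zip(sizes[:-1], sizes[1:])]

a = np.ones((2, 1))  # the input, as a (2, 1) column vector
for b, w in zip(biases, weights):
    a = sigmoid(np.dot(w, a) + b)

print(a.shape)  # (1, 1): a single output neuron
```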

Of course, the main thing we want our Network objects to do is to learn. To that end we give them an SGD method which implements stochastic gradient descent. Here's the code. It's a little mysterious in a few places, but I'll break it down below, after the listing.
def SGD(self, training_data, epochs, mini_batch_size, eta,
        test_data=None):
    """Train the neural network using mini-batch stochastic
    gradient descent.  The "training_data" is a list of tuples
    "(x, y)" representing the training inputs and the desired
    outputs.  The other non-optional parameters are
    self-explanatory.  If "test_data" is provided then the
    network will be evaluated against the test data after each
    epoch, and partial progress printed out.  This is useful for
    tracking progress, but slows things down substantially."""
    if test_data: n_test = len(test_data)
    n = len(training_data)
    for j in xrange(epochs):
        random.shuffle(training_data)
        mini_batches = [
            training_data[k:k+mini_batch_size]
            for k in xrange(0, n, mini_batch_size)]
        for mini_batch in mini_batches:
            self.update_mini_batch(mini_batch, eta)
        if test_data:
            print "Epoch {0}: {1} / {2}".format(
                j, self.evaluate(test_data), n_test)
        else:
            print "Epoch {0} complete".format(j)

The training_data is a list of tuples (x, y) representing the training inputs and the corresponding desired outputs. The variables epochs and mini_batch_size are what you'd expect: the number of epochs to train for, and the size of the mini-batches to use when sampling. eta is the learning rate, η. If the optional argument test_data is supplied, then the program will evaluate the network after each epoch of training, and print out partial progress. This is useful for tracking progress, but slows things down substantially.
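The way SGD splits the training data into mini-batches can be sketched on toy data, with a list of integers standing in for the real (x, y) pairs:

```python
import random

training_data = list(range(10))  # toy stand-in for the (x, y) pairs
mini_batch_size = 3

random.seed(0)  # hypothetical fixed seed, for repeatability
random.shuffle(training_data)
mini_batches = [training_data[k:k+mini_batch_size]
                for k in range(0, len(training_data), mini_batch_size)]

# 10 items in batches of 3 give batch sizes [3, 3, 3, 1];
# every training example lands in exactly one mini-batch
print([len(mb) for mb in mini_batches])
```

When the batch size doesn't divide the data evenly, the last slice is simply shorter; Python's slicing handles this without any special-casing.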

In each epoch, the code starts by randomly shuffling the training data, and then partitions it into mini-batches of the appropriate size. This is an easy way of sampling randomly from the training data. Then, for each mini_batch, we apply a single step of gradient descent. This is done by the line self.update_mini_batch(mini_batch, eta), which updates the network weights and biases according to a single iteration of gradient descent, using just the training data in mini_batch. Here's the code for the update_mini_batch method:



def update_mini_batch(self, mini_batch, eta):
    """Update the network's weights and biases by applying
    gradient descent using backpropagation to a single mini batch.
    The "mini_batch" is a list of tuples "(x, y)", and "eta"
    is the learning rate."""
    nabla_b = [np.zeros(b.shape) for b in self.biases]
    nabla_w = [np.zeros(w.shape) for w in self.weights]
    for x, y in mini_batch:
        delta_nabla_b, delta_nabla_w = self.backprop(x, y)
        nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
        nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
    self.weights = [w-(eta/len(mini_batch))*nw
                    for w, nw in zip(self.weights, nabla_w)]
    self.biases = [b-(eta/len(mini_batch))*nb
                   for b, nb in zip(self.biases, nabla_b)]

Most of the work is done by this line:

delta_nabla_b, delta_nabla_w = self.backprop(x, y)

This invokes something called the backpropagation algorithm, which is a fast way of computing the gradient of the cost function. So update_mini_batch works simply by computing these gradients for every training example in the mini_batch, and then updating self.weights and self.biases appropriately.

I'm not going to show the code for self.backprop right now. We'll study how backpropagation works in the next chapter, including the code for self.backprop. For now, just assume that it behaves as claimed: it returns the appropriate gradient for the cost associated with the training example x.
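The update applied by update_mini_batch is just w → w − (η/m) Σ ∇w, the gradient summed over the mini-batch and scaled by the learning rate over the batch size. A scalar sketch, using hypothetical per-example gradient values:

```python
# Hypothetical per-example gradients for a single scalar weight
mini_batch_grads = [0.25, -0.25, 0.5, 0.5]
eta = 2.0                   # learning rate
m = len(mini_batch_grads)   # mini-batch size

w = 1.0
# Same rule as update_mini_batch, on one weight:
w = w - (eta/m)*sum(mini_batch_grads)
print(w)  # 1.0 - (2.0/4)*1.0 = 0.5
```

In the real code this arithmetic happens per-element across whole NumPy matrices, one per layer, but the rule is identical.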


The complete program

"" "Network.py~~~~~~~~~~a module to implement the stochastic gradient descent learningalgorithm for A feedforward Neural NE  Twork.  Gradients is calculatedusing backpropagation.  Note that I had focused on making the codesimple, easily readable, and easily modifiable. It is not optimized,and omits many desirable features. "" # # # libraries# Standard libraryimport random# third-party librariesimport numpy as Npclass Network (object): Def __init __ (self, sizes): "" "the list ' sizes ' contains the number of neurons in the respective layers of the networ K. For example, if the list is [2, 3, 1] then it would is a three-layer network with the first layer cont  Aining 2 Neurons, the second layer 3 neurons, and the third layer 1 neuron. The biases and weights for the network is initialized randomly, using a Gaussian distribution with mean 0,  and Variance 1. Note The first layer is a assumed to a input layer, and by conventionWe won ' t set any biases for those neurons, since biases is only ever used in computing the outputs from LAT        ER layers. "" "        Self.num_layers = Len (sizes) self.sizes = Sizes self.biases = [Np.random.randn (y, 1) for y in Sizes[1:]] Self.weights = [Np.random.randn (y, X) for x, y in Zip (sizes[:-1], sizes[1:])] def Feedforwar        D (Self, a): "" "Return the output of the network if" a "is input." "" For B, W in Zip (self.biases, self.weights): a = Sigmoid (Np.dot (w, a) +b) return a def SGD (self, Traini Ng_data, epochs, mini_batch_size, ETA, Test_data=none): "" "Train the neural network using Mini-batch sto  Chastic gradient descent. The ' Training_data ' is a list of tuples ' (x, y) ' representing the training inputs and the desired output  S. The other non-optional parameters is self-explanatory. If ' Test_data ' is provided then the network would beEvaluated against the test data after each epoch, and partial progress printed out.        This was useful for tracking progress, but slows things down substantially. 
"" If test_data:n_test = Len (test_data) n = Len (training_data) for J in Xrange (epochs): Random.shuf Fle (training_data) mini_batches = [Training_data[k:k+mini_batch_size] for k in X  Range (0, N, mini_batch_size)] for Mini_batch in Mini_batches:self.update_mini_batch (Mini_batch, ETA) if Test_data:print "Epoch {0}: {1}/{2}". Format (J, Self.evaluate (tes T_data), n_test) else:print "Epoch {0}". Format (j) def update_mini_batch (self, mini_ Batch, ETA): "" "Update the network ' s weights and biases by applying gradient descent using backpropagation t        o a single mini batch. The ' Mini_batch ' is a list of tuples ' (x, y) ', and ' ETA ' is the learning rate. "" Nabla_b = [Np.zeros (b.shape) for B in self.biases] nabla_w = [Np.zeros (w.shape) for W in Self.weights] for X , y in mini_batch:delta_nabla_b, delta_nabla_w = Self.backprop (x, y) nabla_b = [Nb+dnb for NB, DNB In Zip (Nabla_b, delta_nabla_b)] Nabla_w = [NW+DNW-NW, DNW in Zip (Nabla_w, delta_nabla_w)] Self.weigh TS = [W (Eta/len (mini_batch)) *nw for W, NW in Zip (self.weights, nabla_w)] self.biases = [B (        Eta/len (Mini_batch)) *nb for B, NB in Zip (self.biases, nabla_b)] def backprop (self, x, y):  "" "" Return a Tuple "(Nabla_b, Nabla_w)" representing the gradient for the cost function c_x. "Nabla_b" and "Nabla_w" is layer-by-layer lists of numpy arrays, similar to "self.biases" and "self.        Weights '. 
"" " Nabla_b = [Np.zeros (b.shape) for B in self.biases] nabla_w = [Np.zeros (w.shape) for W in SElf.weights] # Feedforward activation = x activations = [x] # list to store all the activations, Laye R by Layer ZS = [] # list to store all the z-vectors, layer by layer for B, w in Zip (self.biases, Self.weigh TS): Z = Np.dot (w, activation) +b zs.append (z) activation = sigmoid (z) activati Ons.append (activation) # backward Pass delta = self.cost_derivative (activations[-1], y) * Sigmoi D_prime (Zs[-1]) nabla_b[-1] = Delta Nabla_w[-1] = Np.dot (Delta, Activations[-2].transpose ()) # Note T  Hat the variable L in the loop below was used a little # differently to the notation in Chapter 2 of the book.  Here, # L = 1 means the last layer of neurons, L = 2 are the # Second-last layer, and so on. It's a renumbering of the # scheme in the book, used here to take advantage of the fact # that Python can us        e negative indices in lists. For L In xrange (2, self.num_layers): Z = zs[-l] sp = Sigmoid_prime (z) delta = np.dot (self.weight S[-l+1].transpose (), Delta) * SP NABLA_B[-L] = Delta Nabla_w[-l] = Np.dot (Delta, activations[-l-1].t Ranspose ()) return (Nabla_b, NABLA_W) def evaluate (self, test_data): "" "return the number of test inputs For which the neural network outputs the correct result. Note that the neural network's output is assumed to being the index of whichever neuron in the final layer have        The highest activation. "" " Test_results = [(Np.argmax (Self.feedforward (x)), y) for (x, y) in Test_data] return sum (int ( x = = y) for (x, y) in test_results) def cost_derivative (self, output_activations, y): "" "Return the vector of PA        Rtial derivatives \partial c_x/\partial A for the output activations. "" Return (OUTPUT_ACTIVATIONS-Y) # # # # # Miscellaneous functionsdef sigmoid (z): "" "ThE sigmoid function. "" "    Return 1.0/(1.0+np.exp (z)) def sigmoid_prime (z): "" "derivative of the sigmoid function." " return sigmoid (z) * (1-sigmoid (z))

"" "Mnist_loader~~~~~~~~~~~~a Library to load the mnist image data.  For details of the Datastructures is returned, see the doc strings for ' load_data ' and ' load_data_wrapper '. In practice, ' load_data_wrapper ' are thefunction usually called by our neural network code. "" # # # libraries# Standard libraryimport cpickleimport gzip# third-party librariesimport numpy as Npdef load_data (): "" "R    Eturn the MNIST data as a tuple containing the training data, the validation data, and the test data.    The ' Training_data ' is returned as a tuple with a entries.  The first entry contains the actual training images.  This was a numpy Ndarray with 50,000 entries. Each entry are, in turn, a numpy ndarray with 784 values, representing the. * = 784 pixels in a single MNIST ima    Ge.  The second entry in the ' Training_data ' "tuple is a numpy Ndarray containing 50,000 entries. Those entries is just the digit values (0...9) for the corresponding images contained inThe first entry of the tuple.    The ' validation_data ' and ' Test_data ' is similar, except each contains only images.  This was a nice data format, but the if use in neural networks it's helpful to modify the format of the ' Training_data ' a    Little.    That's done in the wrapper function ' Load_data_wrapper () ", see below. "" "F = gzip.open (' ... /data/mnist.pkl.gz ', ' RB ') Training_data, validation_data, test_data = Cpickle.load (f) f.close () Return (training _data, Validation_data, Test_data) def load_data_wrapper (): "" "Return a tuple containing" (Training_data, Validation_da TA, Test_data) '.    Based on "Load_data", but the "format is" more convenient for use with our implementation of neural networks.  In particular, "Training_data" is a list containing 50,000 2-tuples "(x, y) '.  ' X ' is a 784-dimensional numpy.ndarray containing the input image. ' Y ' is a 10-dimensional numpy.ndarray representing the unit vector corresponding to The correct digit for ' x '.  
' Validation_data ' and ' test_data ' are lists containing 2-tuples ' (x, y) '. In each case, "X" is a 784-dimensional numpy.ndarry containing the input image, and ' Y ' are the corresponding CLA    Ssification, i.e, the digit values (integers) corresponding to ' X '.  Obviously, this means we ' re using slightly different formats for the training data and the Validation/test data.    These formats turn out to being the most convenient for use with our neural network code. "" Tr_d, va_d, te_d = Load_data () training_inputs = [Np.reshape (x, (784, 1)) for x in tr_d[0]] Training_results = [Vect Orized_result (y) for y in tr_d[1]] Training_data = Zip (training_inputs, training_results) validation_inputs = [Np.re Shape (x, (784, 1)) for x in va_d[0]] Validation_data = Zip (validation_inputs, va_d[1]) test_inputs = [Np.reshape (x, (784, 1)) For x in te_d[0]] Test_data = Zip (test_inputs, te_d[1]) return (training_datA, Validation_data, Test_data) def Vectorized_result (j): "" "Return a 10-dimensional unit vector with a 1.0 in the jth  Position and zeroes elsewhere.    This was used to convert a digit (0...9) into a corresponding desired output from the neural network. "" E = Np.zeros ((ten, 1)) e[j] = 1.0 return E


# test network.py (with the quadratic cost function)
import mnist_loader
training_data, validation_data, test_data = \
    mnist_loader.load_data_wrapper()
import network
net = network.Network([784, 10])
net.SGD(training_data, 5, 10, 5.0, test_data=test_data)







