An artificial neural network (ANN) is a mathematical model of distributed, parallel information processing that mimics the behavioral characteristics of biological neural networks. Such a network relies on the complexity of the system: by adjusting the connections between a large number of nodes it processes information, and it has the ability to learn and adapt on its own. This article introduces the theoretical foundations of neural networks together with a Python implementation.
1. Multilayer feedforward neural networks
A multilayer feedforward neural network consists of three parts: the input layer, the hidden layers and the output layer; each layer is composed of units.
The input layer receives the feature vector of a training instance and passes it on through the weighted connections between nodes; the output of one layer is the input of the next. There can be any number of hidden layers, but only one input layer and one output layer.
The input layer is not counted: if the hidden layers and the output layer together number N, the network is called an N-layer neural network. A network with one hidden layer and one output layer, for example, is a 2-layer neural network.
Each layer computes a weighted sum of its inputs and transforms it with a nonlinear function to produce its output. In theory, given enough hidden units and a large enough training set, such a network can approximate any function.
2. Designing the neural network structure
Before using neural networks, it is necessary to determine the number of layers of the neural network and the number of units per layer.
In order to accelerate the learning process, feature values usually need to be normalized to the range between 0 and 1 before being passed to the input layer.
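As a sketch, min-max normalization of a small made-up feature matrix can be done with NumPy (the matrix `X` here is purely illustrative):

```python
import numpy as np

# Hypothetical feature matrix: 3 instances, 2 features
X = np.array([[2.0, 100.0],
              [4.0, 300.0],
              [6.0, 500.0]])

# Min-max scaling: map each feature (column) into [0, 1]
X_min = X.min(axis=0)
X_max = X.max(axis=0)
X_scaled = (X - X_min) / (X_max - X_min)

print(X_scaled)
```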
A discrete variable can be encoded by assigning one input unit to each of its possible values.
For example, if a feature A can take three values (A0, A1, A2), then three input units can be used to represent A:
if A = A0, the unit for A0 takes the value 1 and the rest take 0;
if A = A1, the unit for A1 takes the value 1 and the rest take 0;
if A = A2, the unit for A2 takes the value 1 and the rest take 0.
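The encoding scheme above can be sketched as follows (the helper `one_hot` is hypothetical, not part of the article's code):

```python
import numpy as np

# Hypothetical discrete feature A with three possible values A0, A1, A2
values = ['A0', 'A1', 'A2']

def one_hot(value, values):
    """Encode a discrete value with one input unit per possible value:
    the matching unit takes 1, the rest take 0."""
    encoding = np.zeros(len(values))
    encoding[values.index(value)] = 1
    return encoding

print(one_hot('A1', values))  # the unit for A1 is set to 1
```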
Neural networks can solve both classification and regression problems. For classification, if there are two classes, a single output unit (taking values 0 and 1) can represent them; if there are more than two classes, each class gets its own output unit, so the number of units in the output layer is usually the number of classes.
There is no clear rule for choosing the best number of hidden layers; it is generally tuned experimentally based on test error and accuracy.
3. Cross-validation
How is accuracy calculated? The simplest way is to split the data into a training set and a test set: train a model on the training set, feed the test set into the model, and compare the predictions with the true labels of the test set to obtain the accuracy.
A common approach in machine learning is cross-validation. Instead of dividing the data into 2 parts, it may be divided into, say, 10 parts:
Round 1: part 1 is the test set, the remaining 9 parts are the training set;
Round 2: part 2 is the test set, the remaining 9 parts are the training set;
......
After 10 rounds of training we obtain 10 accuracy values, and averaging them gives the final accuracy. Here 10 is just a special case: in general the data is divided into k parts and the algorithm is called k-fold cross-validation. Each round, one of the k parts serves as the test set and the remaining k-1 parts as the training set; this is repeated k times and the accuracies are averaged, which is a more scientific and reliable evaluation method.
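The k-fold procedure described above can be sketched as follows (`train_and_score` is a hypothetical callback standing in for any training routine; it is not part of the article's code):

```python
import numpy as np

def k_fold_accuracy(X, y, train_and_score, k=10):
    """Split the data into k parts; each part serves once as the test set
    while the remaining k-1 parts form the training set.
    `train_and_score` is a hypothetical callback:
    (X_train, y_train, X_test, y_test) -> accuracy.
    Returns the mean accuracy over the k rounds."""
    indices = np.arange(len(X))
    folds = np.array_split(indices, k)
    scores = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        scores.append(train_and_score(X[train_idx], y[train_idx],
                                      X[test_idx], y[test_idx]))
    return np.mean(scores)

# Tiny demo with a dummy scorer that reports a constant accuracy
X_demo = np.arange(20).reshape(10, 2)
y_demo = np.arange(10)
demo_score = lambda X_tr, y_tr, X_te, y_te: 0.9
print(k_fold_accuracy(X_demo, y_demo, demo_score, k=5))
```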
4. The BP (backpropagation) algorithm
The instances of the training set are processed iteratively;
after each instance passes through the network, the predicted value is compared with the true value;
the error is then propagated in the reverse direction (from the output layer through the hidden layers to the input layer), and the weight of each connection is updated so as to minimize the error.
4.1 Detailed description of the algorithm
Input: a data set, a learning rate, and a multilayer neural network architecture;
Output: a trained neural network;
Initialize the weights and biases: random values between -1 and 1 (or some other small range), with one bias per unit. For each training instance X, perform the following steps.
1. Forward propagation from the input layer:

The computation from the input layer to the hidden layer, and from the hidden layer to the output layer, has the same form, so both can be summarized in one formula:

    I_j = Σ_i w_ij · O_i + θ_j

Here I_j is the weighted input of unit j in the current layer, O_i is the output of unit i in the previous layer, w_ij is the weight of the connection between the two units, and θ_j is the bias of unit j. The weighted input of each unit then undergoes a nonlinear transformation, as follows:

    O_j = f(I_j)

where O_j is the output of the current unit and f is the nonlinear transformation function, also known as the activation function. Using the logistic function, it is defined as:

    f(x) = 1 / (1 + e^(-x))

That is, the output of each unit is:

    O_j = 1 / (1 + e^(-I_j))

In this way the output value of every layer can be computed forward from the input values.
2. Backward propagation of the error. For the output layer (where T_k is the true value and O_k is the predicted value):

    Err_k = O_k · (1 - O_k) · (T_k - O_k)

For a unit j in a hidden layer:

    Err_j = O_j · (1 - O_j) · Σ_k Err_k · w_jk

Weight update (where l is the learning rate):

    Δw_ij = l · Err_j · O_i
    w_ij = w_ij + Δw_ij

Bias update:

    Δθ_j = l · Err_j
    θ_j = θ_j + Δθ_j
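As an illustration with made-up numbers, here is one full update step for a single logistic output unit (all values, O_i = 0.5, w_ij = 0.4, θ_j = 0.1, T = 1, l = 0.5, are chosen purely for demonstration):

```python
import numpy as np

# Made-up numbers for one output unit with logistic activation
O_i = 0.5        # output of the previous layer's unit
w_ij = 0.4       # connection weight
theta_j = 0.1    # bias of the current unit
T = 1.0          # true value
l = 0.5          # learning rate

# Forward pass: weighted input, then logistic activation
I_j = w_ij * O_i + theta_j            # 0.4 * 0.5 + 0.1 = 0.3
O_j = 1 / (1 + np.exp(-I_j))          # logistic output

# Output-layer error: Err_j = O_j * (1 - O_j) * (T - O_j)
Err_j = O_j * (1 - O_j) * (T - O_j)

# Updates: Δw_ij = l * Err_j * O_i,  Δθ_j = l * Err_j
w_ij += l * Err_j * O_i
theta_j += l * Err_j
print(round(w_ij, 4), round(theta_j, 4))
```

Because the target T is larger than the output, the error is positive and both the weight and the bias increase, pushing the unit's output toward the target.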
3. Termination conditions. Training stops when one of the following holds:
the weight updates fall below some threshold;
the prediction error rate falls below some threshold;
a predetermined number of iterations is reached.
4. The nonlinear transformation function.

Two functions are commonly used for the nonlinear transformation f mentioned above:

(1) The tanh(x) function:

    tanh(x) = sinh(x) / cosh(x)
    sinh(x) = (e^x - e^(-x)) / 2
    cosh(x) = (e^x + e^(-x)) / 2

(2) The logistic function, which is the one used in the derivation above:

    f(x) = 1 / (1 + e^(-x))
5. Python implementation of the BP neural network
First import the NumPy module:

    import numpy as np
Define the nonlinear transformation functions; since their derivatives are also needed, both forms are defined:

    def tanh(x):
        return np.tanh(x)

    def tanh_deriv(x):
        # derivative of tanh: 1 - tanh(x)^2
        return 1.0 - np.tanh(x) * np.tanh(x)

    def logistic(x):
        return 1 / (1 + np.exp(-x))

    def logistic_derivative(x):
        # derivative of the logistic function: f(x) * (1 - f(x))
        return logistic(x) * (1 - logistic(x))
Next, design the BP neural network (the number of layers and the number of units per layer) in an object-oriented way; the constructor chooses the nonlinear function and initializes the weights. `layers` is a list containing the number of units in each layer.

    class NeuralNetwork:
        def __init__(self, layers, activation='tanh'):
            """
            :param layers: a list containing the number of units in each
                           layer; should contain at least two values
            :param activation: the activation function to be used;
                               can be 'logistic' or 'tanh'
            """
            if activation == 'logistic':
                self.activation = logistic
                self.activation_deriv = logistic_derivative
            elif activation == 'tanh':
                self.activation = tanh
                self.activation_deriv = tanh_deriv

            # randomly initialize the weights in [-0.25, 0.25];
            # the extra "+ 1" unit per layer holds the bias
            self.weights = []
            for i in range(1, len(layers) - 1):
                self.weights.append((2 * np.random.random((layers[i - 1] + 1, layers[i] + 1)) - 1) * 0.25)
                self.weights.append((2 * np.random.random((layers[i] + 1, layers[i + 1])) - 1) * 0.25)
Implementing the training algorithm:

    def fit(self, X, y, learning_rate=0.2, epochs=10000):
        X = np.atleast_2d(X)
        # append a column of ones to X for the bias unit
        temp = np.ones([X.shape[0], X.shape[1] + 1])
        temp[:, 0:-1] = X
        X = temp
        y = np.array(y)

        for k in range(epochs):
            # pick one training instance at random
            i = np.random.randint(X.shape[0])
            a = [X[i]]

            # forward propagation: compute the output of each layer
            for l in range(len(self.weights)):
                a.append(self.activation(np.dot(a[l], self.weights[l])))

            # backward propagation of the error
            error = y[i] - a[-1]
            deltas = [error * self.activation_deriv(a[-1])]
            for l in range(len(a) - 2, 0, -1):
                deltas.append(deltas[-1].dot(self.weights[l].T) * self.activation_deriv(a[l]))
            deltas.reverse()

            # update the weights
            for i in range(len(self.weights)):
                layer = np.atleast_2d(a[i])
                delta = np.atleast_2d(deltas[i])
                self.weights[i] += learning_rate * layer.T.dot(delta)
Implementing prediction:

    def predict(self, x):
        x = np.array(x)
        # append a one for the bias unit
        temp = np.ones(x.shape[0] + 1)
        temp[0:-1] = x
        a = temp
        for l in range(0, len(self.weights)):
            a = self.activation(np.dot(a, self.weights[l]))
        return a
Finally, we feed in a set of values to make predictions, with the program above saved in a file named BP.py:

    from BP import NeuralNetwork
    import numpy as np

    nn = NeuralNetwork([2, 2, 1], 'tanh')
    x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([1, 0, 0, 1])
    nn.fit(x, y, 0.1, 10000)
    for i in [[0, 0], [0, 1], [1, 0], [1, 1]]:
        print(i, nn.predict(i))
The results are as follows:

    ([0, 0], array([0.99738862]))
    ([0, 1], array([0.00091329]))
    ([1, 0], array([0.00086846]))
    ([1, 1], array([0.99751259]))