Neural network for "reprint"

Last Update:2015-12-24 Source: Internet

Author: User

Tags neural net


1. Data preprocessing

Before training a neural network, the data must be preprocessed, and an important preprocessing method is normalization. The following is a brief introduction to the principle and method of normalization.

(1) What is normalization?

Data normalization maps data onto the interval [0,1] or [-1,1], or onto a smaller interval such as (0.1,0.9).

(2) Why should normalization be processed?

<1> The input data have different units, and some values may span a particularly large range, which makes the neural network converge slowly and train for a long time.

<2> Inputs with a large range can play too large a role in pattern classification, while inputs with a small range may play too small a role.

<3> Because the range of the activation function in the output layer is limited, the target data for training must be mapped into the range of that activation function. For example, if the output layer uses the S-shaped activation function, whose values are limited to (0,1), the output of the network is also limited to (0,1), so the target outputs of the training data are normalized to the [0,1] interval.

<4> The S-shaped activation function is very flat outside the (0,1) interval, so it discriminates poorly there. For example, for the S-shaped function f(x) with parameter a=1, f(100) and f(5) differ by only 0.0067.
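The saturation described in <4> can be checked numerically. The following is a minimal Python sketch, assuming the S-shaped function is the logistic function f(x) = 1 / (1 + e^(-a*x)) with a = 1; the name `sigmoid` is chosen here for illustration and does not come from the original article.

```python
import math

def sigmoid(x, a=1.0):
    """Logistic S-shaped function with steepness parameter a (assumed form)."""
    return 1.0 / (1.0 + math.exp(-a * x))

# Two inputs that differ by 95 produce almost the same output.
gap = sigmoid(100) - sigmoid(5)
print(round(gap, 4))  # 0.0067
```

Because the two outputs are nearly identical, a network fed unnormalized values of this size can barely tell them apart.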

(3) Normalization algorithm

A simple and fast normalization algorithm is a linear conversion algorithm. There are two common forms of linear conversion algorithms:

<1>

y = (x - min) / (max - min)

where min is the minimum value of x, max is the maximum value of x, x is the input vector, and y is the normalized output vector. This formula normalizes the data to the [0, 1] interval and applies when the activation function is the S-shaped function (whose range is (0,1)).

<2>

y = 2 * (x - min) / (max - min) - 1

This formula normalizes the data to the [-1, 1] interval. It applies when the activation function is the bipolar S-shaped function (whose range is (-1,1)).
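Both linear conversions can be tried on a small vector. This is a minimal Python sketch rather than MATLAB code; the function names `normalize_01` and `normalize_pm1` and the sample values are made up for illustration.

```python
def normalize_01(xs):
    """Map values linearly onto [0, 1]: y = (x - min) / (max - min)."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        raise ValueError("all values are equal; the range is zero")
    return [(x - lo) / (hi - lo) for x in xs]

def normalize_pm1(xs):
    """Map values linearly onto [-1, 1]: y = 2 * (x - min) / (max - min) - 1."""
    return [2.0 * y - 1.0 for y in normalize_01(xs)]

sample = [2.0, 4.0, 6.0, 10.0]
print(normalize_01(sample))   # [0.0, 0.25, 0.5, 1.0]
print(normalize_pm1(sample))  # [-1.0, -0.5, 0.0, 1.0]
```

The second form is the first one stretched by 2 and shifted by -1, which is why the sketch reuses `normalize_01`.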

(4) MATLAB data normalization processing function

In MATLAB, data can be normalized with 3 functions: premnmx, postmnmx, and tramnmx.

<1> premnmx

Syntax: [pn,minp,maxp,tn,mint,maxt] = premnmx(p,t)

Parameters:

pn: the matrix p normalized by row

minp, maxp: the minimum and maximum values of each row of the matrix p

tn: the matrix t normalized by row

mint, maxt: the minimum and maximum values of each row of the matrix t

Function: normalizes the matrices p and t to [-1,1]; mainly used to normalize the training data set.

<2> tramnmx

Syntax: [pn] = tramnmx(p,minp,maxp)

Parameters:

minp, maxp: the minimum and maximum values of the matrix, as computed by the premnmx function

pn: the normalized matrix

Function: mainly used to normalize the input data to be classified.

<3> postmnmx

Syntax: [p,t] = postmnmx(pn,minp,maxp,tn,mint,maxt)

Parameters:

minp, maxp: the minimum and maximum values of each row of the matrix p, as computed by the premnmx function

mint, maxt: the minimum and maximum values of each row of the matrix t, as computed by the premnmx function

Function: maps the matrices pn and tn back to the range they had before normalization. The postmnmx function is mainly used to map the output of the neural network back to the data range before normalization.
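The division of labor among the three functions (compute the row statistics once, reuse them for new data, and undo the mapping afterwards) can be sketched in Python. The functions `fit_rows`, `apply_rows`, and `invert_rows` are hypothetical analogues written for this illustration; they are not the MATLAB functions and do not handle rows whose minimum equals their maximum.

```python
def apply_rows(matrix, mins, maxs):
    """Normalize data to [-1, 1] with previously computed per-row min and max."""
    return [[2.0 * (x - lo) / (hi - lo) - 1.0 for x in row]
            for row, lo, hi in zip(matrix, mins, maxs)]

def fit_rows(matrix):
    """Normalize each row to [-1, 1]; also return the per-row min and max."""
    mins = [min(row) for row in matrix]
    maxs = [max(row) for row in matrix]
    return apply_rows(matrix, mins, maxs), mins, maxs

def invert_rows(normalized, mins, maxs):
    """Map normalized values back to the original range."""
    return [[(y + 1.0) / 2.0 * (hi - lo) + lo for y in row]
            for row, lo, hi in zip(normalized, mins, maxs)]

train = [[1.0, 3.0, 5.0],      # feature 1 across three samples
         [10.0, 20.0, 30.0]]   # feature 2 across three samples
train_n, mins, maxs = fit_rows(train)
test_n = apply_rows([[2.0], [25.0]], mins, maxs)  # one new sample
print(train_n)                           # [[-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]]
print(test_n)                            # [[-0.5], [0.5]]
print(invert_rows(train_n, mins, maxs))  # [[1.0, 3.0, 5.0], [10.0, 20.0, 30.0]]
```

The important design point is that test data are scaled with the training minimum and maximum, not with their own, so that both sets share one mapping.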

2. Using MATLAB to implement neural networks

Building a feedforward neural network in MATLAB mainly uses the following 3 functions:

newff: creates a feedforward network

train: trains a neural network

sim: simulates the network

The use of these 3 functions is briefly introduced below.

(1) newff function

<1> newff function syntax

The newff function has many optional parameters, which are described in MATLAB's help documentation; only a simple form of newff is described here.

Syntax: net = newff(A, B, {C}, 'trainFun')

Parameters:

A: an n-by-2 matrix whose i-th row holds the minimum and maximum values of the input signal xi;

B: a k-dimensional row vector whose elements are the numbers of nodes in each layer of the network;

C: a k-dimensional row vector of strings, each of which names the activation function of the neurons in the corresponding layer;

trainFun: the training algorithm used for learning.

<2> Common activation functions

The commonly used activation functions are:

a) Linear function (linear transfer function)

f(x) = x

The string for this function is 'purelin'.

b) Logarithmic S-shaped transfer function (logarithmic sigmoid transfer function)

The string for this function is 'logsig'.

c) Hyperbolic tangent S-shaped function (hyperbolic tangent sigmoid transfer function)

This is the bipolar S-shaped function mentioned above.

The string for this function is 'tansig'.

The toolbox\nnet\nnet\nntransfer subdirectory of the MATLAB installation directory contains the definitions of all activation functions.
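For reference, the three activation functions can be written out with their standard textbook definitions. This is a minimal Python sketch that borrows the MATLAB names for readability; it is not the toolbox code.

```python
import math

def purelin(x):
    """Linear transfer function: f(x) = x."""
    return x

def logsig(x):
    """Logarithmic sigmoid: f(x) = 1 / (1 + exp(-x)), with range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tansig(x):
    """Hyperbolic tangent sigmoid: f(x) = 2 / (1 + exp(-2x)) - 1, with range (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0

for x in (-2.0, 0.0, 2.0):
    print(x, purelin(x), round(logsig(x), 4), round(tansig(x), 4))
```

The tansig expression is algebraically the same as tanh(x), which is why it is called the bipolar counterpart of logsig.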

<3> Common training functions

The commonly used training functions are:

traingd: gradient descent BP training function (gradient descent backpropagation)

traingdx: gradient descent training function with an adaptive learning rate

<4> Network Configuration parameters

Some important network configuration parameters are as follows:

net.trainParam.goal: target error of the neural network training

net.trainParam.show: interval at which intermediate results are displayed

net.trainParam.epochs: maximum number of iterations

net.trainParam.lr: learning rate

(2) train function

The network training (learning) function.

Syntax: [net, tr, Y1, E] = train(net, X, Y)

Parameters:

X: the actual input of the network

Y: the desired (target) output of the network

tr: training record information

Y1: the actual output of the network

E: the error matrix

(3) sim function

Syntax: Y = sim(net, X)

Parameters:

net: the network

X: the K-by-N input matrix, where K is the number of network inputs and N is the number of data samples

Y: the Q-by-N output matrix, where Q is the number of network outputs
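The matrix layout (one column per sample, K input rows in, Q output rows out) can be illustrated with a hand-written forward pass. This is a minimal Python sketch of a network with a logsig hidden layer and a linear output layer; the function `forward` and all weights are hypothetical and unrelated to how the toolbox implements sim.

```python
import math

def forward(x_cols, w_hidden, b_hidden, w_out, b_out):
    """Evaluate a logsig-hidden, linear-output network on a K x N input.

    x_cols is a list of K rows with N columns (one column per sample);
    the result is a list of Q rows with N columns.
    """
    n = len(x_cols[0])
    outputs = [[0.0] * n for _ in w_out]
    for j in range(n):
        sample = [row[j] for row in x_cols]
        hidden = [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(ws, sample)) + b)))
                  for ws, b in zip(w_hidden, b_hidden)]
        for q, (ws, b) in enumerate(zip(w_out, b_out)):
            outputs[q][j] = sum(w * h for w, h in zip(ws, hidden)) + b
    return outputs

# Made-up weights: K = 2 inputs, 2 hidden nodes, Q = 1 output, N = 3 samples.
X = [[0.0, 1.0, 2.0],
     [0.0, 1.0, 2.0]]
Y = forward(X, w_hidden=[[1.0, -1.0], [0.5, 0.5]], b_hidden=[0.0, 0.0],
            w_out=[[2.0, 0.0]], b_out=[1.0])
print(len(Y), len(Y[0]))  # 1 3
```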

(4) MATLAB BP network example

I divided the iris dataset into 2 groups of 75 samples each, with 25 samples of each flower species in each group. One group served as the training samples for the program below and the other as the test samples. For convenience of training, the 3 species of flowers are numbered 1, 2, and 3.

These data are used to train a feedforward network with 4 inputs (one for each of the 4 features) and 3 outputs (each indicating how likely the sample is to belong to the corresponding species).

The MATLAB program is as follows:

    % Read the training data
    [f1, f2, f3, f4, class] = textread('trainData.txt', '%f%f%f%f%f', 150);

    % Normalize the feature values
    [input, minI, maxI] = premnmx([f1, f2, f3, f4]');

    % Construct the output matrix (one row per sample, one column per class)
    s = length(class);
    output = zeros(s, 3);
    for i = 1:s
        output(i, class(i)) = 1;
    end

    % Create the neural network
    % ASSUMPTION: the source text lists only [3] as layer sizes although two
    % activation functions are named; hiddenNodes is a placeholder value.
    hiddenNodes = 10;
    net = newff(minmax(input), [hiddenNodes 3], {'logsig' 'purelin'}, 'traingdx');

    % Set the training parameters
    % (the values of show and epochs are missing from the source text,
    % so the toolbox defaults apply to them here)
    net.trainParam.goal = 0.01;
    net.trainParam.lr = 0.01;

    % Start training
    net = train(net, input, output');

    % Read the test data
    [t1, t2, t3, t4, c] = textread('testData.txt', '%f%f%f%f%f', 150);

    % Normalize the test data with the training minimum and maximum
    testInput = tramnmx([t1, t2, t3, t4]', minI, maxI);

    % Simulate
    Y = sim(net, testInput)

    % Compute the recognition accuracy
    [s1, s2] = size(Y);
    hitNum = 0;
    for i = 1:s2
        [m, index] = max(Y(:, i));
        if index == c(i)
            hitNum = hitNum + 1;
        end
    end
    sprintf('Recognition rate is %3.3f%%', 100 * hitNum / s2)

The recognition rate of the above program is stable at about 95%, and training converges after about 100 iterations. The training curve is shown below:

Figure 9. Training performance
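Two steps of the example program are independent of MATLAB: building the 0/1 output matrix from the class numbers, and counting a hit whenever the largest network output matches the class. The following Python sketch shows both steps in isolation; the function names and the sample scores are made up for illustration.

```python
def one_hot(labels, num_classes):
    """Build one row per sample with a 1 in the column of its class (labels start at 1)."""
    rows = []
    for label in labels:
        row = [0] * num_classes
        row[label - 1] = 1
        rows.append(row)
    return rows

def recognition_rate(outputs, labels):
    """Percentage of samples whose largest output matches the class label.

    outputs holds one list of class scores per sample.
    """
    hits = 0
    for scores, label in zip(outputs, labels):
        predicted = scores.index(max(scores)) + 1
        if predicted == label:
            hits += 1
    return 100.0 * hits / len(labels)

labels = [1, 2, 3, 3]
print(one_hot(labels, 3))  # [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]]
# Made-up network outputs; the last sample is misclassified as class 2.
scores = [[0.9, 0.1, 0.0], [0.2, 0.7, 0.1], [0.1, 0.2, 0.8], [0.1, 0.6, 0.3]]
print(recognition_rate(scores, labels))  # 75.0
```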

(5) Effect of parameter setting on neural network performance

In my experiments, I adjusted the number of nodes in the hidden layer, chose different activation functions, and set different learning rates.

<1> Number of hidden layer nodes

The number of hidden layer nodes has little effect on the recognition rate, but more nodes increase the amount of computation and make training slower.

<2> Selection of activation functions

The activation function has a significant effect on both the recognition rate and the convergence rate. When approximating higher-order curves, the S-shaped function is much more accurate than the linear function, but it also requires much more computation.

<3> Choice of learning rate

The learning rate affects both how fast the network converges and whether it converges at all. A small learning rate helps ensure that the network converges, but convergence is slow. Conversely, a learning rate that is too high may prevent training from converging and hurt the recognition rate.
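The trade-off can be seen even on a toy problem. The following Python sketch runs plain gradient descent on the single-parameter function f(w) = w^2, which stands in for a network's error surface; the function `descend` and the three rates are chosen only for illustration and are not the author's experiment.

```python
def descend(learning_rate, steps=50, start=1.0):
    """Run gradient descent on f(w) = w**2 (gradient 2*w) and return the final w."""
    w = start
    for _ in range(steps):
        w -= learning_rate * 2.0 * w
    return w

# A small rate converges slowly, a moderate rate converges quickly,
# and a rate that is too large makes w grow instead of shrink.
for lr in (0.01, 0.4, 1.1):
    print(lr, abs(descend(lr)))
```

Each step multiplies w by (1 - 2 * learning_rate), so the iteration shrinks toward the minimum only while that factor stays between -1 and 1.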

3. Using AForge.NET to implement neural networks

(1) Introduction to AForge.NET

AForge.NET is an open-source framework implemented in C# for AI, computer vision, and other fields. The Neuro directory in the AForge.NET source code contains a neural network class library.

AForge.NET home: http://www.aforgenet.com/

AForge.NET code download: http://code.google.com/p/aforge/

The class diagram for the AForge.Neuro project is as follows:

Figure 10. Class diagram of the AForge.Neuro class library

Here are a few of the basic classes in Figure 10:

Neuron: abstract base class for neurons

Layer: abstract base class for layers, each consisting of multiple neurons

Network: abstract base class for neural networks, each composed of multiple layers

IActivationFunction: interface for activation functions

IUnsupervisedLearning: interface for unsupervised learning algorithms

ISupervisedLearning: interface for supervised learning algorithms

(2) Using AForge to build a BP neural network

Building a BP neural network with AForge uses the following classes:

<1> SigmoidFunction: S-shaped activation function

Constructor: public SigmoidFunction(double alpha)

The parameter alpha determines the steepness of the S-shaped function.

<2> ActivationNetwork: neural network class

Constructor:

public ActivationNetwork(IActivationFunction function, int inputsCount, params int[] neuronsCount)

: base(inputsCount, neuronsCount.Length)

public virtual double[] Compute(double[] input)

Parameter meaning:

inputsCount: the number of inputs

neuronsCount: the number of neurons in each layer

<3> BackPropagationLearning: BP learning algorithm

Constructor:

public BackPropagationLearning(ActivationNetwork network)

Parameter meaning:

network: the neural network object to be trained

The BackPropagationLearning class has the following 2 properties that the user needs to set:

LearningRate: learning rate

Momentum: momentum factor

The following code builds a BP network with AForge.

    using AForge.Neuro;
    using AForge.Neuro.Learning;

    // Create a multilayer neural network with an S-shaped activation function
    // and 4, 5, 3 neurons per layer (4 is the number of inputs, 3 is the number
    // of outputs, and 5 is the number of hidden-layer nodes)
    ActivationNetwork network = new ActivationNetwork(new SigmoidFunction(2), 4, 5, 3);

    // Create the training algorithm object
    BackPropagationLearning teacher = new BackPropagationLearning(network);

    // Set the learning rate and momentum factor of the BP algorithm
    teacher.LearningRate = 0.1;
    teacher.Momentum = 0;

    // Train for 500 iterations; trainInput and trainOutput are double[][] arrays
    // of training samples and target outputs prepared beforehand
    int iteration = 1;
    while (iteration <= 500)
    {
        teacher.RunEpoch(trainInput, trainOutput);
        ++iteration;
    }

    // Classify with the trained network; t is the input data vector
    double firstOutput = network.Compute(t)[0];

When this program is used to classify the iris data, the recognition rate reaches about 97%.
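To make the roles of the learning rate and the momentum factor concrete, the following is a minimal Python sketch of online backpropagation for a small S-shaped network. It uses the textbook update step = rate * delta * input + momentum * previous step; AForge's internal update rule may differ. The 2-3-2 network size, the toy data, and the rate and momentum values are all made up for illustration.

```python
import math
import random

ALPHA = 2.0  # steepness of the S-shaped function, mirroring SigmoidFunction(2)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-ALPHA * x))

def make_layer(n_in, n_out, rng):
    """One weight list per neuron; the last element is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]

def forward(layers, x):
    """Return the activations of every layer, starting with the input itself."""
    activations = [x]
    for layer in layers:
        x = [sigmoid(sum(w * v for w, v in zip(neuron, x)) + neuron[-1])
             for neuron in layer]
        activations.append(x)
    return activations

def run_epoch(layers, steps, inputs, targets, rate, momentum):
    """One pass of online backpropagation; returns the summed squared error."""
    total = 0.0
    for x, t in zip(inputs, targets):
        acts = forward(layers, x)
        out = acts[-1]
        total += 0.5 * sum((ti - oi) ** 2 for ti, oi in zip(t, out))
        # Output deltas: error times the sigmoid derivative ALPHA * y * (1 - y).
        deltas = [(ti - oi) * ALPHA * oi * (1.0 - oi) for ti, oi in zip(t, out)]
        for li in reversed(range(len(layers))):
            layer, inp = layers[li], acts[li]
            # Propagate deltas to the layer below before changing the weights.
            below = [ALPHA * a * (1.0 - a) * sum(d * n[i] for d, n in zip(deltas, layer))
                     for i, a in enumerate(inp)] if li > 0 else []
            for k, neuron in enumerate(layer):
                for i in range(len(neuron)):
                    signal = inp[i] if i < len(inp) else 1.0  # bias sees a constant 1
                    steps[li][k][i] = rate * deltas[k] * signal + momentum * steps[li][k][i]
                    neuron[i] += steps[li][k][i]
            deltas = below
    return total

rng = random.Random(0)
layers = [make_layer(2, 3, rng), make_layer(3, 2, rng)]
steps = [[[0.0] * len(neuron) for neuron in layer] for layer in layers]
inputs = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
targets = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

errors = [run_epoch(layers, steps, inputs, targets, rate=0.5, momentum=0.5)
          for _ in range(1000)]
print(errors[0], errors[-1])  # error of the first and the last epoch
```

The `steps` structure keeps the previous update of every weight, which is what lets the momentum term reuse part of the last step.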

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
