1. Common activation functions
Choosing an activation function is an important step in constructing a neural network. The commonly used activation functions are briefly introduced below.
(1) Linear function
(2) Ramp function
(3) Threshold function
The three activation functions above are linear. Two commonly used nonlinear activation functions are described next.
(4) S-shaped function (Sigmoid function)
f(x) = 1 / (1 + e^(-ax))
Its derivative is f'(x) = a * f(x) * (1 - f(x)).
(5) Bipolar S-shaped function
f(x) = (1 - e^(-ax)) / (1 + e^(-ax))
Its derivative is f'(x) = a * (1 - f(x)^2) / 2.
The graphs of the S-shaped function and the bipolar S-shaped function are shown below:
Figure 3. Graphs of the S-shaped function and the bipolar S-shaped function
The main difference between the S-shaped function and the bipolar S-shaped function is their range: the bipolar S-shaped function has range (-1, 1), while the S-shaped function has range (0, 1).
Because both the S-shaped function and the bipolar S-shaped function are differentiable (their derivatives are continuous functions), they are suitable for use in BP neural networks. (The BP algorithm requires the activation function to be differentiable.)
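As a concrete illustration of the two nonlinear activation functions and their derivatives, here is a minimal sketch (written in Python for easy checking; the parameter a is the slope constant from the formulas above, and the function names are just illustrative):

```python
import math

def sigmoid(x, a=1.0):
    """S-shaped (logistic) function; range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a * x))

def sigmoid_deriv(x, a=1.0):
    """Derivative a * f(x) * (1 - f(x)); continuous everywhere."""
    fx = sigmoid(x, a)
    return a * fx * (1.0 - fx)

def bipolar_sigmoid(x, a=1.0):
    """Bipolar S-shaped function; range (-1, 1)."""
    return (1.0 - math.exp(-a * x)) / (1.0 + math.exp(-a * x))

def bipolar_sigmoid_deriv(x, a=1.0):
    """Derivative a * (1 - f(x)^2) / 2; also continuous."""
    fx = bipolar_sigmoid(x, a)
    return a * (1.0 - fx * fx) / 2.0
```

The continuity of these derivatives is exactly what the BP algorithm relies on when backpropagating errors.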
2. Data preprocessing
Before training a neural network, the data must be preprocessed. An important preprocessing step is normalization; its principle and methods are briefly introduced below.
(1) What is normalization?
Data normalization maps the data into the interval [0, 1] or [-1, 1], or into a smaller interval such as (0.1, 0.9).
(2) Why is normalization needed?
<1> The input features have different units, and some may take particularly large values, which slows the convergence of the neural network and lengthens training.
<2> In pattern classification, inputs with a large value range can contribute too much, while inputs with a small value range may contribute too little.
<3> Because the range of the output-layer activation function is limited, the target values used for training must be mapped into that range. For example, if the output layer uses the S-shaped activation function, whose range is (0, 1), the network's outputs are confined to (0, 1), so the training targets must be normalized into the [0, 1] interval.
<4> The S-shaped activation function saturates away from the origin, where its sensitivity to input changes is very small. For example, for the S-shaped function f(x) with parameter a = 1, f(100) and f(5) differ by only about 0.0067.
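The saturation claim in <4> is easy to check numerically (a quick Python sketch, with a = 1):

```python
import math

def sigmoid(x, a=1.0):
    return 1.0 / (1.0 + math.exp(-a * x))

# Far from the origin the curve is nearly flat: a huge change
# in the input barely moves the output.
diff = sigmoid(100.0) - sigmoid(5.0)
print(round(diff, 4))  # -> 0.0067
```

This is why unnormalized inputs with large magnitudes push the activation into its flat region, where gradients (and hence learning) nearly vanish.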
(3) normalization algorithm
A simple and fast normalization algorithm is a linear conversion algorithm. There are two common forms of linear conversion algorithms:
<1> y = (x - min) / (max - min)
where x is the input vector, y is the normalized output vector, and min and max are the minimum and maximum values of x. This normalizes the data to the [0, 1] interval, which applies when the activation function is the S-shaped function (range (0, 1)).
<2> y = 2 * (x - min) / (max - min) - 1
This normalizes the data to the [-1, 1] interval, which applies when the activation function is the bipolar S-shaped function (range (-1, 1)).
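Both linear transforms can be sketched in a few lines (Python; the helper names are illustrative):

```python
def normalize_01(xs):
    """<1>: map data linearly onto [0, 1]."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

def normalize_pm1(xs):
    """<2>: map data linearly onto [-1, 1]."""
    lo, hi = min(xs), max(xs)
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in xs]

print(normalize_01([2, 4, 6]))   # -> [0.0, 0.5, 1.0]
print(normalize_pm1([2, 4, 6]))  # -> [-1.0, 0.0, 1.0]
```

Note that the minimum and maximum always map to the interval endpoints, so the choice between the two formulas should match the range of the output activation function.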
(4) MATLAB data normalization functions
In MATLAB, data can be normalized with the three functions premnmx, tramnmx, and postmnmx.
<1> premnmx
Syntax: [pn, minp, maxp, tn, mint, maxt] = premnmx(p, t)
Parameters:
pn: matrix p normalized by row
minp, maxp: minimum and maximum value of each row of p
tn: matrix t normalized by row
mint, maxt: minimum and maximum value of each row of t
Function: normalizes the matrices p and t to [-1, 1]; mainly used to normalize the training data set.
<2> tramnmx
Syntax: [pn] = tramnmx(p, minp, maxp)
Parameters:
minp, maxp: the minimum and maximum values previously computed by premnmx
pn: the normalized matrix
Function: mainly used to normalize the input data that is to be classified.
<3> postmnmx
Syntax: [p, t] = postmnmx(pn, minp, maxp, tn, mint, maxt)
Parameters:
minp, maxp: the per-row minimum and maximum values of p computed by premnmx
mint, maxt: the per-row minimum and maximum values of t computed by premnmx
Function: maps the matrices pn and tn back to their ranges before normalization (denormalization). The postmnmx function is mainly used to map the outputs of the neural network back to the original data range.
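The three functions fit together as map / reuse-the-map / inverse-map. A Python sketch of that contract for a single data row (the formula matches the [-1, 1] transform given in section (3); the function names are only illustrative):

```python
def premnmx_row(p):
    """Normalize one row to [-1, 1]; return the min/max for later reuse."""
    lo, hi = min(p), max(p)
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in p], lo, hi

def tramnmx_row(p, lo, hi):
    """Normalize new data with the min/max computed on the training data."""
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in p]

def postmnmx_row(pn, lo, hi):
    """Inverse mapping: recover the original range from normalized values."""
    return [(y + 1.0) * (hi - lo) / 2.0 + lo for y in pn]

train_row = [1.0, 2.0, 3.0, 5.0]
pn, lo, hi = premnmx_row(train_row)
print(pn)                          # endpoints map to -1 and 1
print(tramnmx_row([3.0], lo, hi))  # -> [0.0]
print(postmnmx_row(pn, lo, hi))    # round trip back to the original row
```

The key design point is that test data must be transformed with the training set's min/max (tramnmx), never with its own, so that training and test inputs live on the same scale.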
3. Using MATLAB to implement neural networks
Building a feedforward neural network in MATLAB mainly involves the following three functions:
newff: feedforward network creation function
train: neural network training function
sim: network simulation function
These three functions are briefly introduced below.
(1) newff function
<1>NEWFF function Syntax
The newff function has a number of optional parameters; see MATLAB's help documentation for details. Only a simple form of the function is described here.
Syntax: net = newff(A, B, {C}, 'trainFun')
Parameters:
A: an n×2 matrix whose i-th row holds the minimum and maximum value of the i-th input signal;
B: a k-dimensional row vector whose elements are the numbers of nodes in each layer of the network;
C: a k-dimensional cell array of strings, each element being the activation function of the corresponding layer;
trainFun: the training function implementing the learning rule.
<2> Common activation Functions
The usual activation functions are:
a) Linear transfer function
f(x) = x
The string for this function is 'purelin'.
b) Logarithmic sigmoid transfer function
f(x) = 1 / (1 + e^(-x))
The string for this function is 'logsig'.
c) Hyperbolic tangent sigmoid transfer function
This is the bipolar S-shaped function mentioned above. The string for this function is 'tansig'.
The toolbox\nnet\nnet\nntransfer subdirectory of the MATLAB installation directory contains the definitions of all the activation functions.
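To make concrete what a two-layer network built with layer sizes [10 3] and transfer functions {'logsig' 'purelin'} computes, here is a minimal Python sketch of one forward pass (random weights stand in for trained ones; the helper names mirror the MATLAB transfer-function strings):

```python
import math
import random

def logsig(x):
    """'logsig' transfer function: range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def purelin(x):
    """'purelin' transfer function: the identity."""
    return x

def forward(x, W1, b1, W2, b2):
    """One forward pass: logsig hidden layer, purelin output layer."""
    hidden = [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [purelin(sum(w * h for w, h in zip(row, hidden)) + b)
            for row, b in zip(W2, b2)]

random.seed(0)
n_in, n_hid, n_out = 4, 10, 3  # mirrors a [10 3] layer specification
W1 = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hid)]
b1 = [random.uniform(-1, 1) for _ in range(n_hid)]
W2 = [[random.uniform(-1, 1) for _ in range(n_hid)] for _ in range(n_out)]
b2 = [random.uniform(-1, 1) for _ in range(n_out)]
y = forward([0.1, -0.2, 0.3, 0.4], W1, b1, W2, b2)
print(len(y))  # -> 3
```

Training (what the train function does) then adjusts W1, b1, W2, b2 by backpropagating the error through these same transfer functions.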
<3> Common training Functions
The common training functions are:
traingd: gradient descent backpropagation training function
traingdx: gradient descent with adaptive learning rate training function
<4> Network configuration Parameters
Some important network configuration parameters are as follows:
net.trainParam.goal: target error of the network training
net.trainParam.show: period for displaying intermediate results
net.trainParam.epochs: maximum number of iterations
net.trainParam.lr: learning rate
(2) train function
The network training (learning) function.
Syntax: [net, tr, Y1, E] = train(net, X, Y)
Parameters:
X: the network's actual input
Y: the network's target output
tr: training trace information
Y1: the network's actual output
E: the error matrix
(3) sim function
Syntax: Y = sim(net, X)
Parameters:
net: the network
X: the input to the network, a K×N matrix, where K is the number of network inputs and N is the number of data samples
Y: the output matrix, Q×N, where Q is the number of network outputs
(4) MATLAB BP network example
I divided the Iris data set into 2 groups of 75 samples each, with 25 samples of each flower species per group. One group served as the training set for the program below, and the other as the test set. For convenience of training, the three flower species are numbered 1, 2, and 3.
This data is used to train a feedforward network with 4 inputs (one per feature) and 3 outputs (each indicating how likely the sample is to belong to the corresponding species).
The MATLAB program is as follows:
% read the training data (note the data format: values separated by commas or spaces)
[f1, f2, f3, f4, class] = textread('TrainData.txt', '%f%f%f%f%f', 150);

% normalize the features
[input, minI, maxI] = premnmx([f1, f2, f3, f4]');

% construct the output matrix
s = length(class);
output = zeros(s, 3);
for i = 1 : s
    output(i, class(i)) = 1;
end

% create the neural network
net = newff(minmax(input), [10 3], {'logsig' 'purelin'}, 'traingdx');

% set the training parameters
net.trainParam.show = 50;
net.trainParam.epochs = 500;
net.trainParam.goal = 0.01;
net.trainParam.lr = 0.01;

% start training
net = train(net, input, output');

% read the test data
[t1, t2, t3, t4, c] = textread('TestData.txt', '%f%f%f%f%f', 150);

% normalize the test data with the training-set statistics
testInput = tramnmx([t1, t2, t3, t4]', minI, maxI);

% simulate
Y = sim(net, testInput)

% compute the recognition rate
[s1, s2] = size(Y);
hitNum = 0;
for i = 1 : s2
    [m, Index] = max(Y(:, i));
    if Index == c(i)
        hitNum = hitNum + 1;
    end
end
sprintf('The recognition rate is %3.3f%%', 100 * hitNum / s2)
Experimental results:
Other neural network examples:
% generate sample points of the specified classes and plot them
X = [0 1; 0 1];      % range of the class centers
clusters = 5;        % number of classes
points = 10;         % number of points per class
std_dev = 0.05;      % standard deviation of each class
P = nngenc(X, clusters, points, std_dev);
plot(P(1, :), P(2, :), '+r');
title('Input Sample Vector');
xlabel('P(1)');
ylabel('P(2)');

% create the network (5 neurons)
net = newc([0 1; 0 1], 5, 0.1);

% get the network weights and plot them
figure;
plot(P(1, :), P(2, :), '+r');
w = net.IW{1}
hold on;
plot(w(:, 1), w(:, 2), 'ob');
hold off;
title('Input sample vectors and initial weights');
xlabel('P(1)');
ylabel('P(2)');

figure;
plot(P(1, :), P(2, :), '+r');
hold on;

% train the network
net.trainParam.epochs = 7;
net = init(net);
net = train(net, P);

% get the weights after training and plot them
w = net.IW{1}
plot(w(:, 1), w(:, 2), 'ob');
hold off;
title('Input sample vectors and updated weights');
xlabel('P(1)');
ylabel('P(2)');

% simulate the network for a given point
a = 0;
p = [0.6; 0.8];
a = sim(net, p)

% ************** specify two-dimensional input vectors and their classes **************
P = [-3 -2 -2  0  0  0  0 +2 +2 +3;
      0 +1 -1 +2 +1 -1 -2 +1 -1  0];
C = [1 1 1 2 2 2 2 1 1 1];

% convert the classes into target vectors for the LVQ network
T = ind2vec(C)

% plot the input vectors in different colors
plotvec(P, C);
title('Input two-dimensional vectors');
xlabel('P(1)');
ylabel('P(2)');

% create the network
net = newlvq(minmax(P), 4, [.6 .4], 0.1);

% plot the input vectors and the initial weight vectors in the same figure
figure;
plotvec(P, C);
hold on;
W1 = net.IW{1};
plot(W1(:, 1), W1(:, 2), 'ow');
title('Input and weight vectors');
xlabel('P(1), W(1)');
ylabel('P(2), W(2)');
hold off;

% train the network and plot the weight vectors again
figure;
plotvec(P, C);
hold on;
net.trainParam.epochs = 150;
net.trainParam.show = Inf;
net = train(net, P, T);
plotvec(net.IW{1}', vec2ind(net.LW{2}), 'o');

% for a specific point, get the output of the network
p = [0.8; 0.3];
a = vec2ind(sim(net, p))

% ********** randomly generate 1000 two-dimensional vectors as samples and plot their distribution **********
P = rands(2, 1000);
plot(P(1, :), P(2, :), '+r')
title('Initial random sample point distribution');
xlabel('P(1)');
ylabel('P(2)');

% create the network and get the initial weights
net = newsom([0 1; 0 1], [5 6]);
w1_init = net.IW{1, 1}

% plot the initial weight distribution
figure;
plotsom(w1_init, net.layers{1}.distances)

% train the network with different epoch counts and plot the corresponding weight distributions
for i = 10 : 30 : 100
    net.trainParam.epochs = i;
    net = train(net, P);
    figure;
    plotsom(net.IW{1, 1}, net.layers{1}.distances)
end

% for the trained network, choose a specific input vector and get the network's output
p = [0.5; 0.3];
a = 0;
a = sim(net, p)