"Matlab Neural network Programming" Chemical Industry Press book notes
Part 4: Feedforward Neural Networks, Section 4.2: Linear Neural Networks
This article is a set of reading notes on the book "MATLAB Neural Network Programming"; the source code, formulas, and principles involved all come from that book. If anything here is unclear, please refer to the original.
First, the meanings of some parameters commonly used in neural networks are defined, as shown below:
1. Construction of a linear neural network.
1.1 Generating a linear neuron.
Construct a single-neuron linear network with two inputs, as shown in the following figure.
Its weight matrix W is a row vector, and the network output a is:
a = purelin(n) = purelin(Wp + b) = Wp + b
or:
a = w(1,1)*p1 + w(1,2)*p2 + b
Like the perceptron, the linear neural network has a dividing line determined by the input vectors, namely the equation Wp + b = 0 obtained by setting n = 0. The classification diagram is as follows:
When the input vector lies above and to the right of the dividing line, the output is greater than 0; when it lies below the line, the output is less than 0. The linear neural network can therefore be used to study classification problems, provided, of course, that the problem is linearly separable; this is the same limitation the perceptron has.
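The sign-of-n decision rule above is easy to sketch numerically. The following minimal illustration is in Python/NumPy rather than MATLAB; the weights, bias, and test points are invented for illustration and are not from the book:

```python
import numpy as np

# Hypothetical weights and bias defining the dividing line w.p + b = 0
w = np.array([1.0, 2.0])
b = -2.0

def classify(p):
    """Return +1 if the input lies on the positive side of the line, else -1."""
    n = w @ p + b          # net input n = W p + b
    return 1 if n > 0 else -1

print(classify(np.array([2.0, 2.0])))   # above the line: n = 2 + 4 - 2 = 4 > 0, prints 1
print(classify(np.array([0.0, 0.0])))   # below the line: n = -2 < 0, prints -1
```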
"Example 4-13" application Newlin design a dual-input single-output linear neural network, the input vector range is [-1 1;-1 1], the learning rate is 1.
The source code is as follows:
clear all
net = newlin([-1 1; -1 1], 1);   % the network weights and biases default to 0
w = net.IW{1,1}    % w = [0 0]
b = net.b{1}       % b = 0
net.IW{1,1} = [2 3];   % set the network weights
net.b{1} = -4;         % and the bias
p = [5; 6];            % p is the input vector
a = sim(net, p)        % use the network above to find the output a for p
The output shows the default w = [0 0] and b = 0, followed by a = 2*5 + 3*6 - 4 = 24.
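Since purelin is the identity, the output is just Wp + b; a quick cross-check of Example 4-13 in Python/NumPy (an illustrative translation, not the book's code):

```python
import numpy as np

W = np.array([[2.0, 3.0]])   # weight matrix set above
b = -4.0                     # bias
p = np.array([5.0, 6.0])     # input vector

# purelin is the identity, so a = W p + b
a = W @ p + b
print(a)   # [24.]
```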
1.2 Linear filters.
First comes the tapped delay line applied to the linear neural network (this is the part I do not fully understand).
If a tapped delay line is applied to a linear neural network, the output of the resulting linear filter is:
a(k) = purelin(Wp + b) = w(1,1)*p(k) + w(1,2)*p(k-1) + ... + w(1,N)*p(k-N+1) + b
Such a network can be applied to filtering in signal processing.
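As a sketch of how such a filter computes its output, the following Python/NumPy snippet implements a tapped-delay-line (FIR) linear neuron; the weights, bias, and input sequence are illustrative and are not taken from the book:

```python
import numpy as np

def linear_filter(p, w, b, p_init):
    """ADALINE-style FIR filter: w[0] weights the current input,
    w[1:] weight progressively older delayed inputs; p_init holds
    the initial contents of the delay line (most recent first)."""
    delays = list(p_init)
    out = []
    for x in p:
        taps = np.array([x] + delays)   # current input plus delayed values
        out.append(float(w @ taps + b))
        delays = [x] + delays[:-1]      # shift the delay line by one step
    return out

w = np.array([0.5, 0.25, 0.25])   # illustrative filter weights
y = linear_filter([1, 2, 1, 3], w, b=0.0, p_init=[0.0, 0.0])
print(y)   # [0.5, 1.25, 1.25, 2.25]
```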
"Example 4-14" assumes input vector p, expected output vector T, and initial input delay P1.
The source code is as follows:
clear all
P = {1 2 1 3 3 2};
P1 = {1 3};
T = {5 6 4 7 8};
% use newlind to design a network satisfying the above input/output and delay conditions
net = newlind(P, T, P1);
Y = sim(net, P, P1)   % verify the network output
The output is:
2. Network training.
The training process of an adaptive linear element network can be summarized in three steps:
(1) Expression. Compute the network output a = Wp + b for the training inputs and the error e = t - a relative to the desired output.
(2) Check. Compare the sum of squared output errors with the expected error. If it is less than the expected error, or if training has reached the preset maximum number of epochs, stop; otherwise continue.
(3) Learning. Use the Widrow-Hoff (W-H) learning rule to compute new weights and biases, then return to step (1).
Completing the three steps above once constitutes one training cycle.
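The three-step cycle is just the LMS (Widrow-Hoff) rule. The following Python/NumPy illustration uses invented toy data and an arbitrary learning rate, not the book's example:

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.normal(size=(2, 40))                   # 40 two-dimensional input vectors (columns)
W_true = np.array([[1.5, -0.5]]); b_true = 0.3
T = W_true @ P + b_true                        # targets from a known linear map

W = np.zeros((1, 2)); b = 0.0                  # start from zero weights and bias
lr = 0.05                                      # learning rate (illustrative)
for epoch in range(500):
    A = W @ P + b                              # (1) expression: a = W p + b
    E = T - A                                  #     error e = t - a
    if np.mean(E**2) < 1e-10:                  # (2) check against the error goal
        break
    W += lr * (E @ P.T) / P.shape[1]           # (3) Widrow-Hoff (LMS) update
    b += lr * E.mean()

print(np.round(W, 3), round(b, 3))             # approaches [[1.5 -0.5]] and 0.3
```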
If the network still fails to reach the desired goal after training, there are two options: check whether the problem to be solved is actually suitable for a linear network, or train the network further.
Although it applies only to linear networks, the W-H learning rule is still important because it shows how gradient descent can be used to train a network; it later evolved into the backpropagation method, which makes it possible to train multilayer nonlinear networks.
"Example 4-15" considers the design problem of a large neuron network pattern association. The input vector and target vectors are:
P=[1 1.5 1.2-0.3;-1 2 3-0.5;2 1-1.6 0.9];
t=[0.5 3-2.2 1.4;1.1-1.2 1.7-0.4;3 0.2-1.8-0.4;-1 0.4-1.0 0.6];
This problem could be solved exactly as a system of linear equations, but that is more complicated; an approximate solution with some error can be obtained with an adaptive linear network.
Source:
clear all;
P = [1 1.5 1.2 -0.3; -1 2 3 -0.5; 2 1 -1.6 0.9];
T = [0.5 3 -2.2 1.4; 1.1 -1.2 1.7 -0.4; 3 0.2 -1.8 -0.4; -1 0.4 -1.0 0.6];
[S, Q] = size(T);
max_epoch = 400;           % maximum number of training epochs
err_goal = 0.001;          % target error
lr = 0.9 * maxlinlr(P);    % learning rate
% initial weights and biases
W0 = [1.9978 -0.5959 -0.3517; 1.5543 0.05331 1.3660;
      1.0672 0.3645 -0.9227; -0.7747 1.3839 -0.3384];
b0 = [0.0746; -0.0642; -0.4256; -0.6433];
net = newlin(minmax(P), S, [0], lr);
net.IW{1,1} = W0;
net.b{1} = b0;
A = sim(net, P);
e = T - A;
SSE = sumsqr(e)/(S*Q);     % mean squared error
fprintf('Before training, sum squared error = %g.\n', SSE);  % error before training
net.trainParam.epochs = max_epoch;   % maximum number of epochs
net.trainParam.goal = err_goal;      % target error (mean squared error)
[net, tr] = train(net, P, T);
W = net.IW{1,1}
b = net.b{1}               % display the final weights and biases
The output is:
To obtain exact weights with zero error, the function newlind is simpler; the example above can be solved with the following source code:
clear all;
P = [1 1.5 1.2 -0.3; -1 2 3 -0.5; 2 1 -1.6 0.9];
T = [0.5 3 -2.2 1.4; 1.1 -1.2 1.7 -0.4; 3 0.2 -1.8 -0.4; -1 0.4 -1.0 0.6];
[S, Q] = size(T);
b = [];
W = [];
A = [];
for i = 1:S
    net = newlind(P, T(i,:));   % design a linear network for a single output row
    W = [W; net.IW{1,1}];
    b = [b; net.b{1}];
    A = [A; sim(net, P)];
end
W   % display the complete weights
b   % and biases
A   % display the final network output
Because newlind designs a linear network one output row at a time, designing a multi-output network requires looping over the output rows to obtain the final result.
It is often possible to determine directly whether a linear network has an exact zero-error solution: if the number of degrees of freedom of each neuron (the weights plus the threshold, i.e. R+1) is greater than or equal to the number of constraints (the number of input/target vector pairs Q), written as R+1 >= Q, then the linear network can solve the problem with zero error. However, this conclusion does not hold when the input vectors are linearly dependent or when the network has no threshold.
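This criterion can be checked numerically for Example 4-15: each neuron has R+1 = 4 adjustable parameters and there are Q = 4 vector pairs, so an exact solution exists. The following Python/NumPy sketch finds it by least squares on bias-augmented inputs, which is essentially the computation newlind performs:

```python
import numpy as np

P = np.array([[1, 1.5, 1.2, -0.3],
              [-1, 2, 3, -0.5],
              [2, 1, -1.6, 0.9]])
T = np.array([[0.5, 3, -2.2, 1.4],
              [1.1, -1.2, 1.7, -0.4],
              [3, 0.2, -1.8, -0.4],
              [-1, 0.4, -1.0, 0.6]])

# Augment the inputs with a row of ones so the bias is solved for jointly:
# [W b] @ [P; 1] = T  ->  least-squares solution for [W b]
Pa = np.vstack([P, np.ones(P.shape[1])])
Wb = np.linalg.lstsq(Pa.T, T.T, rcond=None)[0].T
W, b = Wb[:, :-1], Wb[:, -1]

err = np.abs(W @ P + b[:, None] - T).max()
print(err)   # essentially zero: R+1 = 4 >= Q = 4 and the inputs are independent
```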