Pattern Recognition Classifier Learning (2)

Source: Internet
Author: User

The book covers a lot of material on geometric classifiers, both linear and non-linear. Linear classifiers include the perceptron algorithm, the incremental correction algorithm, the LMSE classification algorithm, and Fisher's linear discriminant. Non-linear classifiers include the potential function method.

Seeing so many geometric classifiers made me dizzy, so I only studied the perceptron algorithm, and that is the only one I write down here. For neural networks, the book only introduces the three-layer BP network, so I only record the BP network.

 

III. Geometric Classifiers

Suppose there are M pattern classes w1, w2, ..., wM. For M categories in a D-dimensional space, we need M discriminant functions d1(x), d2(x), ..., dM(x). A sample x belongs to class i if

di(x) > dj(x), j = 1, 2, ..., M; j ≠ i.
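
As a minimal sketch of this decision rule (the discriminant values below are made up), the sample is simply assigned to the class whose discriminant function is largest:

```python
# Hypothetical discriminant values d1(x), d2(x), d3(x) for one sample x.
d = [0.2, 1.5, -0.7]

# x is assigned to the class i whose di(x) is the largest.
predicted_class = max(range(1, len(d) + 1), key=lambda i: d[i - 1])
print(predicted_class)  # prints 2, since d2(x) = 1.5 is the maximum
```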

 

1. Perceptron Algorithm (Reward-Punishment)

Training drives the discriminant function of the correct class to the maximum value. For M classes, there are M discriminant functions.

Given M classes of samples w1, w2, ..., wM and the sample x(k) presented at the k-th iteration, the perceptron algorithm evaluates all M discriminant functions:

① If di[x(k)] > dj[x(k)] for all j = 1, 2, ..., M, j ≠ i, the weight vectors need no correction.

② If di[x(k)] <= dj[x(k)] for some j, the weight vectors are corrected (rewarded and punished):

wi(k+1) = wi(k) + C·x(k)

wj(k+1) = wj(k) − C·x(k), for every offending j

C is a correction increment that may be chosen freely, for example 1 or 0.5.

Tutorial steps:

Training and learning:

① Set the initial value of every weight vector to 0, that is, w1 = w2 = ... = wM = 0, where wm = (wm1, wm2, ..., wmn);

② Augment the sample features in the sample database and each weight vector with an extra component 1, so wm = (wm1, wm2, ..., wmn, 1). Set i = 1, k = 1.

③ Substitute the k-th sample x = (x1, x2, ..., xn, 1) of class i into each discriminant function di(x).

④ If there exists some j ≠ i, j = 1, 2, ..., M, with di(x) <= dj(x), then for every such j set

wj = wj − C·x

and set

wi = wi + C·x

    if (the current class has no next sample)
    {
        if (there is no next class)
            go to ⑤;
        else
        {
            i++;
            k = 1;
            go to ③;
        }
    }
    else
    {
        k++;
        go to ③;
    }

⑤ Reset i = 1, k = 1 and repeat from ③ until no wi changes any more, or until the specified number of iterations is reached.
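
The training and recognition steps above can be sketched in Python. The toy samples, C = 1, and the epoch limit below are assumptions for illustration, not values from the text:

```python
# A minimal sketch of the multi-class perceptron (reward-punishment) loop.
# Features are augmented with a trailing 1 so the threshold term is absorbed
# into the weight vector.

def train_perceptron(samples, num_classes, C=1.0, max_epochs=100):
    """samples: list of (feature_list, class_index) pairs."""
    dim = len(samples[0][0]) + 1                    # augmented dimension
    w = [[0.0] * dim for _ in range(num_classes)]   # step ①: all weights zero

    for _ in range(max_epochs):
        changed = False
        for x, ci in samples:
            xa = list(x) + [1.0]                    # step ②: augment with 1
            d = [sum(wi * xi for wi, xi in zip(w[m], xa))
                 for m in range(num_classes)]       # step ③: all di(x)
            # step ④: punish every class whose discriminant is not beaten
            wrong = [j for j in range(num_classes) if j != ci and d[j] >= d[ci]]
            if wrong:
                changed = True
                for j in wrong:
                    w[j] = [wj - C * xj for wj, xj in zip(w[j], xa)]
                w[ci] = [wi + C * xi for wi, xi in zip(w[ci], xa)]
        if not changed:                             # step ⑤: weights settled
            break
    return w

def classify(w, x):
    """Recognition: augment x with 1 and pick the largest discriminant."""
    xa = list(x) + [1.0]
    scores = [sum(wi * xi for wi, xi in zip(wm, xa)) for wm in w]
    return scores.index(max(scores))

# Toy two-class example (assumed data):
samples = [([0.0, 0.0], 0), ([1.0, 1.0], 1)]
w = train_perceptron(samples, num_classes=2)
print(classify(w, [0.9, 0.9]))  # prints 1
```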

Recognition:

Augment the sample to be tested with an extra 1 and substitute it into each discriminant function; the function yielding the largest value gives the class of the sample.

Geometric classifiers do not require many training samples.

 

IV. Neural Network Classifiers

An artificial neuron is essentially a multi-input, single-output non-linear threshold device. Here x1, x2, ..., xn are its n inputs and w1, w2, ..., wn the corresponding weights; Σ wi·xi is called the activation value.

o = f(Σ wi·xi − θ)

is the output of this artificial neuron, where θ is its threshold. If the activation value is greater than the threshold, the neuron is activated.

f is called the activation function.
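
A minimal sketch of this neuron model (the inputs, weights, and threshold below are made up for illustration):

```python
import math

def neuron(x, w, theta, f):
    """Single artificial neuron: o = f(Σ wi·xi − θ)."""
    activation = sum(wi * xi for wi, xi in zip(w, x))  # activation value
    return f(activation - theta)

step = lambda s: 1.0 if s >= 0 else 0.0        # threshold activation
sigmoid = lambda s: 1.0 / (1.0 + math.exp(-s)) # S-type activation

# Activation 0.4*1.0 + 0.6*0.5 = 0.7 exceeds θ = 0.5, so the neuron fires.
print(neuron([1.0, 0.5], [0.4, 0.6], theta=0.5, f=step))  # prints 1.0
```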

Common activation functions can be categorized into three forms:

1) Threshold function

2) Sigmoid function (S-shaped function, the most common activation function in artificial neural networks). a is the slope parameter of the sigmoid; changing a yields sigmoids of different slopes. As a approaches infinity the sigmoid turns into a simple threshold function, but the sigmoid ranges continuously over (0, 1) while the threshold function takes only the two values 0 and 1, and the sigmoid is differentiable while the threshold function is not.

3) Piecewise linear function
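
To illustrate the slope parameter, a sigmoid of the form f(s) = 1 / (1 + e^(−a·s)) steepens toward the threshold function as a grows (the sample point s = 0.2 and the values of a are assumptions):

```python
import math

def sigmoid(s, a=1.0):
    """Sigmoid with slope parameter a: f(s) = 1 / (1 + exp(-a*s))."""
    return 1.0 / (1.0 + math.exp(-a * s))

# Larger a pushes the output at a fixed positive s closer to the
# threshold-function value 1 (and closer to 0 for negative s).
for a in (1, 5, 50):
    print(a, round(sigmoid(0.2, a), 4))
```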

Neural network structures fall into two main categories: layered and interconnected.

A layered neural network is divided into several layers by function, generally an input layer, one or more middle (hidden) layers, and an output layer. There are three variants: the simple feed-forward network, the feed-forward network with feedback, and the feed-forward network with intra-layer connections. The BP network is a typical feed-forward network.

The BP network is trained with a teacher (supervised learning). When a learning pattern pair is presented to the network, the neuron activations propagate from the input layer through the middle layers to the output layer, and each output-layer neuron produces the network's response to the input pattern. Then, following the principle of reducing the error between the expected and actual outputs, the connection weights are corrected layer by layer, from the output layer back through the middle layers to the input layer. This is the error back-propagation algorithm. The BP learning process consists of the following four parts.

① Forward propagation of the input pattern (the input is computed from the input layer through the middle layer to the output layer).

② Back-propagation of the output error (the output error is propagated from the output layer through the middle layer back to the input layer).

③ Cyclic memory training (① and ② repeated n times).

④ Judgment of the learning result (determining whether the global error has approached a minimum).

Perform the following training for each sample of each class.

(If this is the network's first training run, you must initialize the weights wij and wjt, the thresholds θj and θt, and the learning coefficients α and β. The weights are usually drawn at random from (−1, 1) or [−2.4/F, 2.4/F]; the thresholds from 0.01–0.8; and the learning coefficients satisfy 0 < α, β < 1.)
1. Forward propagation of the input pattern

① Determine the input vector (that is, the features of the learning sample) x = (x1, x2, ..., xn).

n is the number of input-layer units (the feature count).

② Determine the expected output vector (the teacher signal) y = (y1, y2, ..., yq). I cannot explain this clearly; if, for example, the network should output digits, you can use each bit of the binary representation of the digit.

q is the number of output-layer units.

③ Calculate the activation value of each middle-layer neuron:

sj = Σi wij·xi − θj, i = 1, 2, ..., n

wij is the connection weight from the input layer to the middle layer, xi is the sample feature value, p is the number of middle-layer units, and θj is the threshold of each middle-layer unit.

The activation function is the S-shaped (sigmoid) function.

④ Calculate the output value of middle-layer unit j:

bj = f(sj)

θj is also corrected continually during learning.

⑤ Calculate the activation value lt of output-layer unit t:

lt = Σj wjt·bj − θt, j = 1, 2, ..., p

⑥ Calculate the actual output value ct of output-layer unit t:

ct = f(lt)

wjt is the weight from the middle layer to the output layer, θt is the output-layer unit threshold, f is the S-shaped activation function, and q is the number of output-layer units.
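
Steps ③–⑥ can be sketched as a forward pass through the network, using the symbols above (wij, θj for the middle layer, wjt, θt for the output layer). The network sizes and numeric values are assumptions for illustration:

```python
import math

def sigmoid(s):
    """S-shaped activation function f(s) = 1 / (1 + exp(-s))."""
    return 1.0 / (1.0 + math.exp(-s))

def forward(x, wij, theta_j, wjt, theta_t):
    # Middle-layer activations: sj = Σi wij·xi − θj, output bj = f(sj).
    b = [sigmoid(sum(w[i] * x[i] for i in range(len(x))) - th)
         for w, th in zip(wij, theta_j)]
    # Output-layer activations: lt = Σj wjt·bj − θt, output ct = f(lt).
    c = [sigmoid(sum(w[j] * b[j] for j in range(len(b))) - th)
         for w, th in zip(wjt, theta_t)]
    return b, c

# 2 inputs, 2 middle units, 1 output (assumed weights and thresholds):
b, c = forward([1.0, 0.0],
               wij=[[0.5, -0.5], [0.3, 0.8]], theta_j=[0.1, 0.2],
               wjt=[[0.7, -0.4]], theta_t=[0.05])
print(b, c)
```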

2. Back-propagation of the output error

① The correction error of the output layer is

dt = (yt − ct)·ct·(1 − ct), t = 1, 2, ..., q

(q is the number of output-layer units), where yt is the expected output, ct is the actual output, and the factor ct·(1 − ct) comes from f', the derivative of the sigmoid at the output layer.

② The correction error of the middle layer is

ej = bj·(1 − bj)·Σt dt·wjt, j = 1, 2, ..., p

(p is the number of middle-layer units).

③ The corrections to the middle-to-output connection weights and output-layer thresholds are

Δwjt = α·dt·bj, Δθt = −α·dt

where α is the learning coefficient (the threshold sign follows the convention lt = Σj wjt·bj − θt).

④ The corrections to the input-to-middle weights and middle-layer thresholds are

Δwij = β·ej·xi, Δθj = −β·ej

where β is the learning coefficient.
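
One back-propagation step can be sketched as follows, under the usual sigmoid-BP assumptions (squared error E = ½Σ(yt − ct)², f'(s) = f(s)(1 − f(s)), thresholds entering as lt = Σj wjt·bj − θt). The toy 1-1-1 network and learning coefficients are made up for illustration:

```python
import math

def backprop_step(x, b, c, y, wij, theta_j, wjt, theta_t, alpha, beta):
    # Output-layer correction error: dt = (yt - ct) * ct * (1 - ct).
    d = [(yt - ct) * ct * (1.0 - ct) for yt, ct in zip(y, c)]
    # Middle-layer correction error: ej = bj * (1 - bj) * Σt dt * wjt.
    e = [bj * (1.0 - bj) * sum(d[t] * wjt[t][j] for t in range(len(d)))
         for j, bj in enumerate(b)]
    # Correct middle-to-output weights and output-layer thresholds.
    for t in range(len(d)):
        for j in range(len(b)):
            wjt[t][j] += alpha * d[t] * b[j]
        theta_t[t] -= alpha * d[t]   # sign matches lt = Σj wjt·bj − θt
    # Correct input-to-middle weights and middle-layer thresholds.
    for j in range(len(b)):
        for i in range(len(x)):
            wij[j][i] += beta * e[j] * x[i]
        theta_j[j] -= beta * e[j]

# Toy 1-1-1 network (assumed values): repeated steps shrink the output error.
f = lambda s: 1.0 / (1.0 + math.exp(-s))
x, y = [1.0], [1.0]
wij, theta_j = [[0.2]], [0.1]
wjt, theta_t = [[0.3]], [0.1]
for _ in range(200):
    b = [f(wij[0][0] * x[0] - theta_j[0])]    # middle-layer output
    c = [f(wjt[0][0] * b[0] - theta_t[0])]    # actual output
    backprop_step(x, b, c, y, wij, theta_j, wjt, theta_t, alpha=0.5, beta=0.5)
print(round(c[0], 3))  # climbs toward the expected output 1.0
```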
Note:

(1) Number of network layers. A three-layer network can realize any multi-dimensional mapping, that is, it can approximate any rational function. This gives a basic principle for designing a BP network: to improve accuracy, you can either add neurons to the middle layer or add more layers; generally, increasing the number of middle-layer neurons should be considered first.

(2) Number of middle-layer neurons. Increasing the number of middle-layer neurons improves accuracy, but by how much should it be increased? A common practice is to set the number of middle-layer neurons to double the number of input-layer neurons, plus some margin as appropriate.

(3) Choice of initial values. For the network's first training run, initialize the weights wij and wjt, the thresholds θj and θt, and the learning coefficients α and β. The weights are usually drawn at random from (−1, 1) or [−2.4/F, 2.4/F], the thresholds from 0.01–0.8, and the learning coefficients satisfy 0 < α, β < 1.

 

Ah ~ I finally finished writing up classification... Just as I was about to finish, I accidentally hit a QQ shortcut that navigated the page I was writing on to another website, so I had to rewrite everything once! Painful! Still, I am in a good mood now that it is done. If there are any errors, please point them out. Thank you.

 

Please leave a comment; it is not easy to write something up. Thank you.
