Machine Learning - Li Hang's Statistical Learning Methods - Study Notes: The Perceptron (2)


In Machine Learning - Li Hang's Statistical Learning Methods - Study Notes: The Perceptron (1), we covered the perceptron model, its geometric meaning, and the relevant derivations. With the mathematical model in hand, we now turn to solving it.

The goal of perceptron learning is to find a separating hyperplane that completely separates the positive and negative instances, i.e., to solve for the parameters w and b in the perceptron model. The learning strategy is to define an empirical loss function and minimize it. The loss function used here is based on the total distance from all misclassified points to the hyperplane S. Let M be the set of points misclassified by the hyperplane S; dropping the constant factor 1/||w||, the loss function is

L(w, b) = -Σ_{x_i ∈ M} y_i (w · x_i + b)

Clearly the loss function L(w, b) is non-negative: each term -y_i (w · x_i + b) is positive for a misclassified point. And since the loss is defined only over misclassified points, if there are none the sum is empty and the loss is exactly 0.
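The loss above is easy to compute directly. A minimal sketch (the function name and array layout are my own, not from the book): X holds one instance per row, y holds labels in {+1, -1}.

```python
import numpy as np

def perceptron_loss(w, b, X, y):
    # Empirical loss L(w, b) = -sum over misclassified points of y_i (w . x_i + b).
    margins = y * (X @ w + b)   # functional margin of every point
    M = margins <= 0            # the misclassified set M (boundary points included)
    return -np.sum(margins[M])
```

For Example 2.1's data, a hyperplane that separates everything (such as w = (1,1)ᵀ, b = -3) gives loss 0, while a misclassifying one gives a positive loss.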

The perceptron learning algorithm is driven by misclassification and uses stochastic gradient descent: first pick an arbitrary hyperplane (w0, b0), then iteratively minimize the objective. The precise definitions are given in the author's book, so I will not repeat them here.

The original form of the perceptron learning algorithm

Let us derive Example 2.1 in detail. The author does give the derivation, and it is enough for readers with a solid background; but for those of us whose college calculus has gone rusty, the author's reasoning deserves a careful derivation by hand.

Solution: set up the optimization problem min_{w,b} L(w, b) = -Σ_{x_i ∈ M} y_i (w · x_i + b), and solve for w and b with Algorithm 2.1, using learning rate η = 1.

Take the initial value w0 = (0,0)^T. (Here w0 is the initial normal vector; in three-dimensional space it would be (0,0,0)^T, but the two-dimensional plane suffices here, so w0 = (0,0)^T.)

b0 = 0.

For x1 = (3,3)^T, a positive instance with y1 = 1, substitute into the separating hyperplane formula:

y1 (w0 · x1 + b0)

= 1 · ((0,0)^T · (3,3)^T + 0) -------- Equation 1.0

Here T denotes the transpose of a matrix: (0,0)^T is (0,0) written as a column vector, and (0,0)^T and (3,3)^T are likewise the representations of vectors. The dot in the middle denotes the inner product of two vectors. Let us recall the definition of the inner product.

Linear algebra defines this precisely: the inner product of (0,0)^T and (3,3)^T is 0·3 + 0·3 = 0.
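As a quick check, the same inner product in code (using numpy's dot; the variable names are mine):

```python
import numpy as np

# The inner product of (0,0)^T and (3,3)^T, as in Equation 1.0:
a = np.array([0., 0.])
x1 = np.array([3., 3.])
ip = np.dot(a, x1)   # 0*3 + 0*3
print(ip)            # 0.0
```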

So the value of Equation 1.0 is 0. Since we want every positive and negative instance separated, i.e., y_i (w · x_i + b) > 0 for every point, this positive instance does not meet the requirement under the current hyperplane. We therefore update w and b.

w1 = w0 + η y1 x1. The meaning of updating w is to move the separating hyperplane in the direction of the misclassified point; in two-dimensional space this changes the slope of the line, while updating b shifts the line's intercept.
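One update step in code, with the book's η = 1 (a sketch; the variable names are mine):

```python
import numpy as np

eta = 1.0                        # learning rate, eta = 1 as in Example 2.1
w0, b0 = np.array([0., 0.]), 0.0
x1, y1 = np.array([3., 3.]), 1.0

# x1 is misclassified: y1 * (w0 . x1 + b0) = 0 <= 0, so apply one update.
w1 = w0 + eta * y1 * x1          # rotates the separating line
b1 = b0 + eta * y1               # shifts its intercept
print(w1, b1)                    # [3. 3.] 1.0
```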

Let us first write out the instances: (x1, y1) = ((3,3)^T, 1), (x2, y2) = ((4,3)^T, 1), (x3, y3) = ((1,1)^T, -1).

We obtain w1 = (0,0)^T + (3,3)^T = (3,3)^T and b1 = b0 + y1 = 1.

So the linear model is w1 · x + b1 = 3x^(1) + 3x^(2) + 1.

We use the functional margin to judge whether a point is correctly classified: the linear model is multiplied by y_i because y_i = 1 for a correctly labeled positive instance and y_i = -1 for a negative one. A product y_i (w · x_i + b) greater than 0 therefore means the point is correctly classified and no parameter update is needed, while a product less than or equal to 0 means the point is misclassified and the parameters must be updated.

Under the new linear model, the points (x1, y1) = ((3,3)^T, 1) and (x2, y2) = ((4,3)^T, 1) clearly give values greater than 0, so they are correctly classified. For (x3, y3) = ((1,1)^T, -1), the functional margin is less than 0, meaning it is misclassified, so the model must be updated again:

w2 = w1 + y3 x3 = (3,3)^T - (1,1)^T = (2,2)^T, and b2 = b1 + y3 = 0.

The remaining iterations of the original form proceed the same way; read carefully, and with these few mathematical concepts understood they are easy to follow, so I will not repeat them.
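The whole iteration can be collected into one loop. The sketch below (the function name is mine; it scans the points cyclically, which happens to reproduce the book's update sequence on this data) recovers the book's answer w = (1,1)^T, b = -3:

```python
import numpy as np

def perceptron_original(X, y, eta=1.0, max_epochs=1000):
    # Original form of the perceptron algorithm (a sketch; names are mine).
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(max_epochs):
        updated = False
        for xi, yi in zip(X, y):
            if yi * (np.dot(w, xi) + b) <= 0:   # a misclassified point drives the update
                w = w + eta * yi * xi
                b = b + eta * yi
                updated = True
        if not updated:                          # no misclassified points left
            break
    return w, b

X = np.array([[3., 3.], [4., 3.], [1., 1.]])    # Example 2.1's instances
y = np.array([1., 1., -1.])
w, b = perceptron_original(X, y)
print(w, b)   # [1. 1.] -3.0
```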

Convergence of the perceptron learning algorithm

Honestly, this part feels less essential and is not hard to understand; perhaps that is just because I did not derive it by hand. Readers who want to study it can read the author's proof directly.

The dual form of the perceptron learning algorithm

The author's book gives the example below, but without the concrete derivation process.

We derive it as follows. From the original form we can read off the update sequence for w.

The first update: the point (x1, y1) = ((3,3)^T, 1) fails to make the model greater than 0, so w1 = w0 + x1y1.

The second update: the point (x3, y3) = ((1,1)^T, -1) fails to make it greater than 0, so w2 = w1 + x3y3.

The third update: the point (x3, y3) = ((1,1)^T, -1) fails to make it greater than 0, so w3 = w2 + x3y3.

The fourth update: the point (x3, y3) = ((1,1)^T, -1) fails to make it greater than 0, so w4 = w3 + x3y3.

The fifth update: the point (x1, y1) = ((3,3)^T, 1) fails to make it greater than 0, so w5 = w4 + x1y1.

The sixth update: the point (x3, y3) = ((1,1)^T, -1) fails to make it greater than 0, so w6 = w5 + x3y3.

The seventh update: the point (x3, y3) = ((1,1)^T, -1) fails to make it greater than 0, so w7 = w6 + x3y3.

Back-substituting step by step, we can expand w7:

w7 = w6 + x3y3

w7 = w5 + x3y3 + x3y3

w7 = w4 + x1y1 + x3y3 + x3y3

w7 = w3 + x3y3 + x1y1 + x3y3 + x3y3

w7 = w2 + x3y3 + x3y3 + x1y1 + x3y3 + x3y3

w7 = w1 + x3y3 + x3y3 + x3y3 + x1y1 + x3y3 + x3y3

w7 = w0 + x1y1 + x3y3 + x3y3 + x3y3 + x1y1 + x3y3 + x3y3

So we conclude that the final value is w7 = 2·x1y1 + 5·x3y3 = 2·(3,3)^T - 5·(1,1)^T = (1,1)^T.

This is exactly the coefficient pattern α1 = 2, α2 = 0, α3 = 5 that appears in the dual form.

Similarly we obtain b7 = 2·y1 + 5·y3 = 2·1 + 5·(-1) = -3. The misclassification condition of Example 2.2 can likewise be written in this summation form: y_i (Σ_j α_j y_j (x_j · x_i) + b) ≤ 0.
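The dual form can be sketched the same way: keep a count α_i of how many updates each point triggers, and work only with inner products via the Gram matrix (a sketch with my own names; η = 1 as in the book):

```python
import numpy as np

X = np.array([[3., 3.], [4., 3.], [1., 1.]])   # Example 2.1's instances
y = np.array([1., 1., -1.])
n = len(X)

G = X @ X.T                  # Gram matrix: G[i, j] = x_i . x_j
alpha = np.zeros(n)          # alpha_i = eta * (number of updates at point i), eta = 1
b = 0.0

updated = True
while updated:
    updated = False
    for i in range(n):
        # w . x_i = sum_j alpha_j y_j (x_j . x_i), so the test needs only G
        if y[i] * (np.sum(alpha * y * G[:, i]) + b) <= 0:
            alpha[i] += 1    # one more update triggered by point i
            b += y[i]
            updated = True

w = (alpha * y) @ X          # recover w = sum_i alpha_i y_i x_i
print(alpha, b, w)           # [2. 0. 5.] -3.0 [1. 1.]
```

Note how the loop never touches the raw coordinates except through G, which is what makes the dual form attractive when inner products (kernels) are cheap to precompute.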

Comparing the author's iteration table against the formulas above, the dual form of the perceptron algorithm should be easy to understand; the derivation turns out to be just simple bookkeeping of the updates. This concludes the perceptron chapter of Statistical Learning Methods.

For part one, see Machine Learning - Li Hang's Statistical Learning Methods - Study Notes: The Perceptron (1).

Original post: http://www.cnblogs.com/santian/p/4351756.html

Blog: http://www.cnblogs.com/santian/

When reprinting, please credit the original source with a hyperlink.

