Statistical Learning Methods notes <Chapter 2: Perceptron>


Chapter 2: Perceptron

The perceptron still feels pretty simple, so let me just write it up.

The perceptron is a binary linear classifier.

The input x is the feature vector of an instance, and the output y is the instance's class, given by the following function:

    f(x) = sign(w · x + b)

Here w is the weight (or weight vector), b is the bias, and sign is the sign function: it outputs +1 when its argument is greater than 0, and -1 otherwise.
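As a minimal sketch of the model above (the function names `sign` and `predict` are my own, not from the book):

```python
def sign(z):
    # sign function: +1 for positive input, -1 otherwise
    return 1 if z > 0 else -1

def predict(w, b, x):
    # f(x) = sign(w . x + b), with w and x as plain Python lists
    return sign(sum(wi * xi for wi, xi in zip(w, x)) + b)

# w . x + b = 1*2 + (-1)*1 + 0.5 = 1.5 > 0, so the prediction is +1
print(predict([1.0, -1.0], 0.5, [2.0, 1.0]))
```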

The perceptron is a discriminative model (it directly learns the mapping function from input to output, and does not care about the joint probability distribution or anything like that).

Geometric interpretation of the perceptron: w · x + b = 0 corresponds to a hyperplane S in the feature space (a hyperplane is a line in two dimensions and a plane in three; I suppose it is called "hyper" because it generalizes the plane to higher dimensions). w is the normal vector of the hyperplane and b is the intercept (easy to see by analogy with a straight line in Euclidean space).

Linear separability: if, for a dataset, a hyperplane w · x + b = 0 can be found that puts all positive instance points on one side and all negative instance points on the other, the dataset is called linearly separable; otherwise it is linearly inseparable.

The learning goal of the perceptron is to find a hyperplane that classifies all positive and negative instance points completely correctly (which unfortunately means the perceptron can only be used on linearly separable datasets (t_t)), that is... to find suitable w and b. The learning strategy is to minimize a loss function.

What loss function? The first thing that comes to mind is the number of misclassified points! Sounds reasonable, but that count is of little help for adjusting w and b: how would you tweak w and b to reduce it? Hard to say, because, seriously, that loss function is not a continuous, differentiable function of w and b. So the wise predecessors thought of the total distance from the misclassified points to the hyperplane instead, which can be written as:

    -(1/||w||) Σ_{x_i ∈ M} y_i (w · x_i + b)

where M is the set of misclassified points.

Why the negative sign? Because for misclassified data (x_i, y_i) we have -y_i (w · x_i + b) > 0 (the Buddha says: too lazy to explain). Dropping the constant factor 1/||w|| then gives the loss function (proofs can go die):

    L(w, b) = -Σ_{x_i ∈ M} y_i (w · x_i + b)
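A minimal sketch of this loss, summing -y_i (w · x_i + b) over the misclassified points only (the helper name `perceptron_loss` is mine; a point with margin exactly 0 is counted as misclassified here):

```python
def perceptron_loss(w, b, X, y):
    # L(w, b) = - sum over misclassified points of y_i * (w . x_i + b)
    total = 0.0
    for xi, yi in zip(X, y):
        margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
        if margin <= 0:  # misclassified (or exactly on the hyperplane)
            total -= margin
    return total

# One misclassified point: y = -1 but w . x + b = 1, contributing a loss of 1
print(perceptron_loss([1.0, 0.0], 0.0, [[1.0, 0.0]], [-1]))
```

Correctly classified points contribute nothing, so the loss is 0 exactly when every point lands on the right side of the hyperplane.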

Then minimize the loss function (-_-zzz):

    min_{w,b} L(w, b) = -Σ_{x_i ∈ M} y_i (w · x_i + b)

Perceptron learning is error-driven (the word "driven" sounds very powerful) and uses stochastic gradient descent (to be written up later). Take the partial derivatives with respect to w and b:

    ∇_w L(w, b) = -Σ_{x_i ∈ M} y_i x_i
    ∇_b L(w, b) = -Σ_{x_i ∈ M} y_i

With the gradients in hand, update using a randomly chosen misclassified point (x_i, y_i) (nothing much to say here; it is clear at a glance):

    w ← w + η y_i x_i
    b ← b + η y_i

What really deserves a word is η, the learning rate: a larger η updates the parameters faster but easily overshoots; a smaller η is steadier but easily becomes too slow.

Then there is the complete algorithm (too lazy to type it out; the book has it as an image):
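Since the image didn't survive, here is a sketch of the primal-form algorithm in Python (the function name `train_perceptron` and the toy dataset are mine; the data mimics the book's small worked example):

```python
def train_perceptron(X, y, eta=1.0, max_epochs=100):
    """Primal-form perceptron: sweep through the data and, whenever a
    point is misclassified (y_i * (w . x_i + b) <= 0), update
    w <- w + eta * y_i * x_i and b <- b + eta * y_i."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) <= 0:
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
                mistakes += 1
        if mistakes == 0:  # converged: every point classified correctly
            break
    return w, b

# Toy linearly separable data: two positive points and one negative point
X = [[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]]
y = [1, 1, -1]
w, b = train_perceptron(X, y)
print(w, b)  # a separating hyperplane w . x + b = 0
```

The outer loop terminates only because the data is linearly separable; on inseparable data the updates would cycle forever, which is why `max_epochs` caps the sweeps.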

The book then follows with various examples, the proof of convergence of the algorithm, the dual form, and so on. Too lazy to write those out; understanding them is enough for now.

Well, that's about it. This chapter is relatively simple; just testing the water.

