"Perceptron Learning Algorithm" — Hsuan-Tien Lin, Machine Learning Foundations


I skipped the first lecture and started directly with the second, on the Perceptron. Here are the points from this lecture that impressed me most:

1. My intuition for this kind of diagram has always been poor, so I always fall back on X and y to understand it.

a) Each coordinate axis of the graph represents a feature value; the features have physical meaning.

b) The circles and crosses mark the different samples (positive vs. negative), i.e. the labels. For convenience later on, positive samples take the label +1 and negative samples take -1.

2. Geometric meaning of the Perceptron learning rule: rotating the normal vector of the separating line (hyperplane)

Since the labels are set to +1 and -1, the update w ← w + y·x can directly express the rotation of the separating line on a misclassified sample, which is ingenious and concise.
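The whole update loop can be sketched in a few lines of NumPy. The toy dataset below is my own invention for illustration, not from the lecture:

```python
import numpy as np

def pla(X, y, max_iters=1000):
    """Perceptron Learning Algorithm: find w with sign(w . x_n) == y_n for all n.

    X: (n, d) feature matrix (first column set to 1 to absorb the bias),
    y: (n,) labels in {+1, -1}. Assumes the data are linearly separable.
    """
    w = np.zeros(X.shape[1])
    for _ in range(max_iters):
        mistakes = np.sign(X @ w) != y      # sign(0) = 0, so it counts as a mistake
        if not mistakes.any():
            return w                        # every sample classified correctly
        i = np.flatnonzero(mistakes)[0]     # pick one misclassified sample
        w = w + y[i] * X[i]                 # the rotation step: w <- w + y*x
    return w

# Toy linearly separable data (hypothetical, just for illustration):
X = np.array([[1.0,  2.0,  2.0],
              [1.0,  1.5,  3.0],
              [1.0, -1.0, -1.5],
              [1.0, -2.0,  0.5]])
y = np.array([1, 1, -1, -1])
w = pla(X, y)
```

On separable data like this, the loop halts with a w that classifies every sample correctly; the number of updates it takes is exactly what the convergence argument below bounds.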

Here is a question: if we adjust according to one point at a time, is that point guaranteed to be classified correctly after the adjustment?

I think the answer is no: after that round's update, the point is not necessarily classified correctly.

For example, if the vector w is particularly long, x is particularly short, and the angle between w and x is particularly large, then after the update w(t+1) = w(t) + y·x there is no guarantee that w(t+1)·x is positive (taking y = +1 here);

But this does not affect the overall trend toward convergence (provided the data are linearly separable).
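A quick numeric check of this claim (the numbers are my own, chosen to match the scenario above): a long w, a short x at a wide angle to it, and y = +1, where one update is not enough:

```python
import numpy as np

w = np.array([10.0, 0.0])    # particularly long weight vector
x = np.array([-1.0, 0.1])    # particularly short sample, wide angle to w
y = 1                        # positive label

assert np.sign(w @ x) != y   # misclassified: w . x = -10
w_next = w + y * x           # one PLA update: w(t+1) = w(t) + y*x
# w_next @ x is about -8.99, still negative: the point remains misclassified
```

So a single update only rotates w toward the right answer; it may take several visits to the same point before it is classified correctly.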

3. Why does the Perceptron Learning Algorithm's update rule converge when the data are linearly separable?

Lin's idea goes like this:

a) First assume the data are linearly separable; under this assumption there exists an ideal separating line, with normal vector w_f.

b) If each update brings w closer to w_f, we consider that an improvement.

c) How do we measure how close w is to w_f? The larger their inner product, the closer they are assumed to be (the smaller the angle between them).

Based on this idea, we can derive:

The gist: following the PLA update rule, the inner product of w_f and w is guaranteed to grow every round, which ensures the algorithm keeps making progress in a good direction.
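The slide's derivation can be reconstructed as follows (standard PLA notation, my reconstruction since the slide image is missing: $(x_{n(t)}, y_{n(t)})$ is the misclassified sample used at round $t$; because $w_f$ classifies every sample correctly, $\min_n y_n w_f^T x_n > 0$):

```latex
w_f^T w_{t+1} = w_f^T \left( w_t + y_{n(t)} x_{n(t)} \right)
            = w_f^T w_t + y_{n(t)}\, w_f^T x_{n(t)}
            \ge w_f^T w_t + \min_n y_n\, w_f^T x_n
            > w_f^T w_t
```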

But there is still a problem: the length of w also changes every round, so simply comparing the size of the inner product of w_f and w is meaningless.

So, going one step further, the following derivation is made:

As for why the 2-norm is used here, I understand it is mainly for convenience of presentation.

What this big paragraph means: after each round of the update, the growth rate of the length of w is capped. (Of course, it does not necessarily grow every round; if the cross term in the expansion is a relatively large negative number, the length may even decrease.)
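The derivation behind this paragraph, reconstructed in standard notation: the update is only ever applied on a mistake, where $y_{n(t)} w_t^T x_{n(t)} \le 0$, so

```latex
\|w_{t+1}\|^2 = \|w_t + y_{n(t)} x_{n(t)}\|^2
            = \|w_t\|^2 + 2\, y_{n(t)}\, w_t^T x_{n(t)} + \|y_{n(t)} x_{n(t)}\|^2
            \le \|w_t\|^2 + \max_n \|x_n\|^2
```

The cross term is exactly where a large negative value can make the length shrink in a given round; the bound only caps the growth per update.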

Taken together, the two slides above illustrate an intuitive point: every round the update moves in a good direction, while the growth of w's length is capped.

With this intuition, we can guess that the update rule converges within a certain number of iterations, i.e. the following formula holds:
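The formula itself, reconstructed since the slide image is missing (this is the standard PLA mistake bound, assuming $w_0 = 0$):

```latex
T \le \frac{R^2}{\rho^2},
\qquad R = \max_n \|x_n\|,
\qquad \rho = \min_n \, y_n \frac{w_f^T x_n}{\|w_f\|}
```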

The courseware did not give the proof, so I worked it out myself:

My handwriting is ugly, but writing it out this way was faster; please bear with it.

The bounds in this proof are quite loose, but it does prove that the PLA update rule converges.
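For reference, a compact version of that proof in standard notation (my reconstruction, assuming $w_0 = 0$, with $R = \max_n \|x_n\|$ and $\rho = \min_n y_n w_f^T x_n / \|w_f\|$). After $T$ updates the two per-round bounds stack up, and since a cosine cannot exceed 1, $T$ is capped:

```latex
\frac{w_f^T w_T}{\|w_f\|} \ge T\rho,
\qquad \|w_T\|^2 \le T R^2
\;\Longrightarrow\;
1 \ge \frac{w_f^T w_T}{\|w_f\|\,\|w_T\|} \ge \frac{T\rho}{\sqrt{T}\,R} = \sqrt{T}\,\frac{\rho}{R}
\;\Longrightarrow\;
T \le \frac{R^2}{\rho^2}
```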

