This lesson is mainly about how to judge whether the parameters fitted by a machine learning classification algorithm are good, which leads to the definitions of the functional margin and the geometric margin.
1. Functional margin
Consider the hypothesis h_{w,b}(x) = g(w^T x + b), where g(z) = 1 if z >= 0 and g(z) = -1 otherwise, so the labels are y ∈ {-1, 1}. When w^T x + b >> 0 we can be very confident that y = 1, and when w^T x + b << 0 we can be very confident that y = -1. So in a classification algorithm, if training yields parameters that classify the examples with this kind of confident, correct result, we know the chosen parameters fit the data well, and we can trust that our classifier is consistent with the facts of the data. This leads to the definition of the functional margin.
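As a minimal sketch of this hypothesis (the function names are my own, not from the lecture), the classifier just thresholds a linear score:

```python
def g(z):
    """Threshold function: +1 if z >= 0, else -1."""
    return 1 if z >= 0 else -1

def hypothesis(w, b, x):
    """Linear classifier h_{w,b}(x) = g(w^T x + b)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return g(z)
```

For example, with w = [2, -1] and b = 0.5, the point x = [1, 1] gives z = 1.5 and is classified as y = 1.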
Given a training example (x^(i), y^(i)) and the hypothesis h_{w,b}(x) = g(w^T x + b) (where w is the weight vector, b is the intercept, and the output takes values in {-1, 1}), we define the functional margin of (w, b) with respect to this example as:

γ̂^(i) = y^(i) (w^T x^(i) + b)
Therefore, if you want the functional margin to be as large as possible, then when y^(i) = 1 you need w^T x^(i) + b to be a large positive number, and when y^(i) = -1 you need it to be a large negative number. Note also that γ̂^(i) > 0 exactly when the example is classified correctly.
When the functional margin is large, the parameters chosen by the algorithm model the training data confidently, which suggests better predictions on the test set.
On the entire training set S = {(x^(i), y^(i)); i = 1, ..., m}, the functional margin is the smallest of the individual margins:

γ̂ = min_{i=1,...,m} γ̂^(i)
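The two definitions above can be sketched directly (a toy illustration; the data and helper names are my own):

```python
def functional_margin(w, b, x, y):
    """Functional margin of one example: y * (w^T x + b)."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)

def dataset_functional_margin(w, b, data):
    """Functional margin over a training set: the smallest per-example margin."""
    return min(functional_margin(w, b, x, y) for x, y in data)

# Toy set: one positive and one negative example, both classified correctly.
data = [([3.0, 0.0], 1), ([0.0, -2.0], -1)]
# Per-example margins with w = [1, 1], b = 0: 3.0 and 2.0, so the set margin is 2.0.
```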
2. Geometric margin
Figure 1
Suppose the hypothesis is represented by the line in Figure 1, called the separating hyperplane (the boundary used to separate the dataset, also called the decision boundary). All the data points in Figure 1 lie in a two-dimensional plane, so here the separating hyperplane is a straight line. If the data points were in three-dimensional space, the separating hyperplane would be an ordinary plane; in general, for data in n-dimensional space the separating hyperplane has dimension n - 1.
The farther a data point is from the decision boundary, the more credible its prediction. Point A in Figure 1 is farthest from the decision boundary, so we can state confidently that it belongs to y = 1; point C is closest to the decision boundary, and if the boundary shifted slightly its prediction could flip to y = -1. Therefore, the choice of separating hyperplane (decision boundary) depends on the distance between the hyperplane and the points closest to it; this distance is the geometric margin, and the closest points are called the support vectors. The larger the geometric margin, the more believable the classifier is.
Figure 2
Figure 2 lets us define the geometric margin. Let point A be a training example x^(i) with y^(i) = 1, and let the hypothesis be h_{w,b}(x) = g(w^T x + b). The vector w is normal to the separating hyperplane, so w/||w|| is the unit normal of the hyperplane. Let γ^(i) denote the distance AB from A to the hyperplane, so that B = x^(i) - γ^(i) · w/||w||. Since B lies on the decision boundary, it satisfies w^T x + b = 0, giving the equation:

w^T (x^(i) - γ^(i) · w/||w||) + b = 0
Solving for γ^(i) gives:

γ^(i) = (w/||w||)^T x^(i) + b/||w||
This solves only the y = 1 case; combining it with the y = -1 case, the geometric margin of an example is defined as:

γ^(i) = y^(i) ((w/||w||)^T x^(i) + b/||w||)
On the entire training set, the geometric margin is the smallest of the individual margins:

γ = min_{i=1,...,m} γ^(i)
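The geometric margin above is the functional margin divided by ||w||, which can be sketched as follows (a toy illustration with my own helper names):

```python
import math

def geometric_margin(w, b, x, y):
    """Geometric margin of one example: y * ((w/||w||)^T x + b/||w||)."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return y * ((sum(wi * xi for wi, xi in zip(w, x)) + b) / norm)

def dataset_geometric_margin(w, b, data):
    """Geometric margin over a training set: the smallest per-example margin."""
    return min(geometric_margin(w, b, x, y) for x, y in data)

# With w = [3, 4] (so ||w|| = 5) and b = 2, the point x = [1, 0] with y = 1
# has geometric margin (3 + 2) / 5 = 1.0.
```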
3. Relationship between the functional margin and the geometric margin
geometric margin = functional margin / ||w||
Functional margins scale with w and b: replacing (w, b) by (cw, cb) multiplies every functional margin by c without changing the classifier at all, so a large functional margin by itself says nothing about the quality of the chosen parameters. The geometric margin, by contrast, does not change when w and b are scaled.
Copyright notice: this is an original article by the blog author; please do not reproduce it without the author's permission.
Thoughts on Stanford "Machine Learning" Lesson 6 ——— 1. Functional margin and geometric margin