1 - Linear Support Vector Machine
We define this quantity as the margin; the earlier problem of choosing the best separating line then becomes the problem of finding the line with the maximum margin.
Among the candidate lines, each represented by some w, the problem becomes using the corresponding w to compare the (relative) distances from the points to the line.
Here w is redefined as the normal vector of the line, and b takes the place of the earlier w0, i.e. the bias.
Since w is the normal direction of the line, the distance from a point to the line becomes a projection onto w.
Because each point xn comes with a label yn, and the sign of the term inside the absolute value in the distance is positive exactly when the point is classified correctly, we can multiply by yn to remove the absolute value.
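Written out, the distance this describes is (my reconstruction in the course's notation):

$$
\operatorname{distance}(x_n; b, w) \;=\; \frac{1}{\|w\|}\,\bigl|\,w^{T}x_n + b\,\bigr| \;=\; \frac{1}{\|w\|}\, y_n\,(w^{T}x_n + b)
\qquad\text{(when $(x_n, y_n)$ is classified correctly).}
$$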
The distance here acts as a tolerance (buffer), so we take the distance of the nearest point, i.e. the minimum over all n.
Since our main purpose is to choose b and w that maximize this shortest distance, we can rescale (b, w) as a whole without affecting which separating line attains the maximum.
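Putting these remarks together, the optimization being described is (a hedged reconstruction):

$$
\max_{b,\,w}\ \operatorname{margin}(b,w)
\quad\text{s.t. every } y_n(w^{T}x_n+b) > 0,
\qquad
\operatorname{margin}(b,w) \;=\; \min_{n}\ \frac{1}{\|w\|}\, y_n (w^{T}x_n + b),
$$

and since replacing $(b, w)$ by $(cb, cw)$ for any $c > 0$ leaves the margin unchanged, the scaling can be fixed freely.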
Here we fix the second factor of the product, min_n yn(wᵀxn + b), to 1; the discussion of the objective's maximum then reduces to a discussion of 1/||w||, and the problem is simplified.
However, min_n yn(wᵀxn + b) = 1 is a strong constraint, so it is relaxed to yn(wᵀxn + b) ≥ 1 for all n.
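With that scaling, the problem reads (again my reconstruction):

$$
\max_{b,\,w}\ \frac{1}{\|w\|}
\qquad\text{s.t.}\quad y_n(w^{T}x_n + b) \ge 1 \ \text{ for all } n.
$$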
This part skips some of the discussion about whether the extremum is actually attained, i.e. why the relaxed constraint is still tight at the optimum.
Here again a very common trick is used: turn the maximization of 1/||w|| into minimizing its reciprocal ||w||, remove the square root by using wᵀw instead, and add a constant 1/2 for convenience in the later derivation:
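In symbols, the resulting problem is (my reconstruction):

$$
\min_{b,\,w}\ \frac{1}{2}\, w^{T}w
\qquad\text{s.t.}\quad y_n(w^{T}x_n + b) \ge 1,\quad n = 1,\dots,N.
$$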
P.S. For example: (1/√2) · [ wᵀ(3,0) + (−1) ].
From this optimization we arrive at the SVM: the points (vectors) that attain the minimum, i.e. lie exactly on the margin boundary, are called support vectors, and the resulting classifier is the support vector machine.
Fortunately, this is a quadratic programming (QP) problem, for which ready-made solvers exist, so we now put it into the standard form.
On the left is our problem; on the right is the standard QP form:
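As a concrete illustration, here is a minimal sketch of feeding this standard form to an off-the-shelf QP solver. It assumes the cvxopt package; the function name, variable layout, and toy data set are my own, chosen in the spirit of the course's small example, not the course's code.

```python
import numpy as np
from cvxopt import matrix, solvers

def hard_margin_svm(X, y):
    """Hard-margin linear SVM via QP. X: (N, d) inputs, y: (N,) labels in {-1, +1}."""
    N, d = X.shape
    # Variable order u = (b, w_1, ..., w_d); objective is (1/2) u^T P u + q^T u.
    P = np.zeros((d + 1, d + 1))
    P[1:, 1:] = np.eye(d)           # only w enters the objective
    P += 1e-9 * np.eye(d + 1)       # tiny ridge so the solver sees a strictly PSD matrix
    q = np.zeros(d + 1)
    # Constraint y_n (w^T x_n + b) >= 1 rewritten as G u <= h.
    G = -y[:, None] * np.hstack([np.ones((N, 1)), X])
    h = -np.ones(N)
    sol = solvers.qp(matrix(P), matrix(q), matrix(G), matrix(h))
    u = np.array(sol['x']).ravel()
    return u[0], u[1:]              # b, w

# Toy, linearly separable data (an assumption, in the style of the lecture example).
X = np.array([[0., 0.], [2., 2.], [2., 0.], [3., 0.]])
y = np.array([-1., -1., 1., 1.])
b, w = hard_margin_svm(X, y)
print(b, w)                         # roughly b = -1, w = (1, -1)
```

The mapping is direct: cvxopt's standard form min (1/2)uᵀPu + qᵀu s.t. Gu ≤ h is exactly the "right side" above. In the toy run, the points with yn(wᵀxn + b) = 1 are the support vectors.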
In other situations, such as nonlinear problems, a feature transform can map the data into a space where they are linearly separable; what we have derived here is the so-called hard-margin SVM, i.e. a line that separates the data completely.
zn = Φ(xn), remember? :-) This is the Z-space transform; afterwards we simply substitute z for x.
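For instance, a minimal sketch reusing hard_margin_svm and the toy data from the snippet above (the transform phi2 is my own illustrative choice of Φ, not the course's):

```python
def phi2(X):
    """A second-order polynomial transform to Z-space (one possible choice of Phi)."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([x1, x2, x1 ** 2, x1 * x2, x2 ** 2])

# Solve exactly the same hard-margin QP, but on z_n = Phi(x_n).
b_z, w_z = hard_margin_svm(phi2(X), y)
# Classify a new point x with sign(w_z . phi2(x) + b_z).
```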
If you compare this with regularization in the MLF course, there the constraint is the one placed on wᵀw.
The essence is similar.
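Side by side, the comparison as I understand it from the course:

- regularization: minimize Ein, subject to a bound on wᵀw
- SVM: minimize wᵀw, subject to Ein = 0 (more precisely, yn(wᵀxn + b) ≥ 1)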
Another way to understand it: if we view the margin constraint in the SVM as a filter, then for a given set of points in the plane,
some dichotomies that do not satisfy the margin requirement are ruled out. This effectively reduces the VC dimension of the hypothesis set, which is another direction from which to understand the optimization.
For example, with the condition margin > 1.126, better generalization performance can be expected than with the PLA.
Taking points on a unit circle as an example, some dichotomies can no longer be realized under this constraint; we then get a new upper bound, smaller than dVC, and this bound is related to the data themselves.
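The bound quoted in the course for data inside a ball of radius R is, as far as I recall (treat the exact form as an assumption):

$$
d_{VC}(\mathcal{A}_{\rho}) \;\le\; \min\!\left(\frac{R^{2}}{\rho^{2}},\, d\right) + 1 \;\le\; d + 1,
$$

where ρ is the required margin and d + 1 is the usual VC dimension of perceptrons in d dimensions.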
In this way, we resolve the tension between the sophistication of the boundary and the number of hypotheses (model complexity):
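Roughly, the comparison drawn in the course (my reconstruction):

- hyperplanes: not many hypotheses, but only simple boundaries
- hyperplanes + feature transform Φ: sophisticated boundaries, but many hypotheses
- large-margin hyperplanes + feature transform Φ: sophisticated boundaries and still not too many hypotheses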
Next, we will discuss nonlinear SVM.