Python Machine Learning Theory and Practice (5): Support Vector Machines (SVM)

Anyone familiar with machine learning knows the support vector machine (SVM): before deep learning emerged, SVM was the dominant method in the field. Its theory is elegant, and it has many well-known variants, such as latent-SVM and structural-SVM. In this section we look at the basic SVM theory. (Figure 1) shows a dataset with two classes; panels B, C, and D each show a linear classifier that separates the data. But which one is best?

(Figure 1)

For this training set, all three classifiers separate the data perfectly, but that is not enough. The training set is only a sample; in actual testing the samples may be distributed differently, and many situations are possible. To cope with this, we want the linear classifier to be as far as possible from both classes, because that reduces the risk that a test sample falls on the wrong side of the boundary, and so improves accuracy. Maximizing the distance (margin) from the dataset to the classifier is the core idea of the support vector machine, and the samples closest to the classifier are called support vectors. Now that we know the goal is to maximize the margin, how do we find the support vectors, and how do we implement this? The following figures show how these tasks are done.

(Figure 2)

Assume the line in (Figure 2) represents a hyperplane. In general, the feature space has one more dimension than the separating hyperplane: as shown in the figure, when the features are two-dimensional the classifier is a one-dimensional line, and when the features are three-dimensional the classifier is a two-dimensional plane. Assume the hyperplane's equation is w^T x + b = 0; the distance from a point A to the hyperplane is derived as follows:

(Figure 3)

In (Figure 3), the blue region represents the hyperplane, x_n is a point in the dataset, and w is the hyperplane's weight vector, which is perpendicular to the hyperplane. The proof of perpendicularity is simple: assume x' and x'' are two points on the hyperplane; then w^T x' + b = 0 and w^T x'' + b = 0, and subtracting gives w^T (x' - x'') = 0.

Therefore, w is perpendicular to the hyperplane. Knowing this, the distance from x_n to the hyperplane is the projection onto w of the line segment from x_n to any point x on the hyperplane, as shown in (Figure 4):
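Since the formula images are missing here, the distance and the optimization problem it leads to can be reconstructed in standard hard-margin SVM notation (a reconstruction, not necessarily the exact layout of the original figures):

```latex
% Distance from x_n to the hyperplane w^T x + b = 0:
d(x_n) = \frac{\lvert w^{T} x_n + b \rvert}{\lVert w \rVert}

% Rescaling w and b so that the closest samples satisfy
% |w^T x_n + b| = 1 turns margin maximization into the primal problem:
\min_{w,\,b} \ \frac{1}{2}\lVert w \rVert^{2}
\quad \text{s.t.} \quad y_n \left( w^{T} x_n + b \right) \ge 1,
\qquad n = 1, \dots, N
```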

Applying the method of Lagrange multipliers to this constrained problem gives the formula shown in (Formula 5):

(Formula 5)

In (Formula 5), we take the derivatives of the Lagrangian with respect to w and b, respectively. To find the extremum, we set each derivative to 0 and obtain:
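The missing formula images at this step are, in standard notation, the Lagrangian and its stationarity conditions (again a reconstruction, assuming the usual hard-margin derivation):

```latex
L(w, b, \alpha) = \frac{1}{2}\lVert w \rVert^{2}
  - \sum_{n=1}^{N} \alpha_n \left[ y_n \left( w^{T} x_n + b \right) - 1 \right]

\frac{\partial L}{\partial w} = 0
  \;\Rightarrow\; w = \sum_{n=1}^{N} \alpha_n y_n x_n,
\qquad
\frac{\partial L}{\partial b} = 0
  \;\Rightarrow\; \sum_{n=1}^{N} \alpha_n y_n = 0
```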

We then substitute these back into the Lagrangian, yielding (Formula 6):

(Formula 6)

The last two rows of (Formula 6) give the optimization problem to be solved; now we only need to solve a quadratic program to obtain the alphas. The quadratic programming formulation is shown in (Formula 7):

(Formula 7)
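To see that the dual actually recovers the separating hyperplane, here is a minimal numerical sketch on a hypothetical two-point dataset. With one point per class, the constraint sum(alpha_n * y_n) = 0 forces the two alphas to be equal, so the dual collapses to a one-dimensional maximization we can do by grid search (a real QP solver, or SMO, would be used in practice):

```python
import numpy as np

# Hypothetical toy dataset: one point per class.
X = np.array([[0.0, 0.0], [2.0, 2.0]])
y = np.array([-1.0, 1.0])

K = X @ X.T  # Gram matrix of inner products x_i . x_j

def dual(a):
    """Dual objective with alpha_1 = alpha_2 = a (forced here by
    the constraint alpha_1*y_1 + alpha_2*y_2 = 0 with opposite labels)."""
    alpha = np.array([a, a])
    return alpha.sum() - 0.5 * (alpha * y) @ K @ (alpha * y)

# Maximize the 1-D concave dual by grid search.
a_grid = np.linspace(0.0, 1.0, 100001)
a_star = a_grid[int(np.argmax([dual(a) for a in a_grid]))]

alpha = np.array([a_star, a_star])
w = (alpha * y) @ X   # w = sum_n alpha_n y_n x_n (first row of Formula 6)
b = y[1] - w @ X[1]   # a support vector satisfies y_n (w.x_n + b) = 1

print(w, b)
```

For these two points the optimum is a_star = 0.25, giving w = [0.5, 0.5] and b = -1: the perpendicular bisector of the segment between the two samples, exactly as the max-margin geometry predicts.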

After the alphas are obtained from (Formula 7), w can be recovered using the first row of (Formula 6). With that, the derivation of SVM is essentially complete. The mathematics is rigorous and elegant, even if some readers find it dry; the real difficulty lies in the optimization. Solving the quadratic program directly requires a large amount of computation, so in practical applications the SMO (Sequential Minimal Optimization) algorithm is commonly used. SMO will be covered in the next section, together with code.


That concludes this article. I hope it is helpful for your study of SVM.