It is well known that an SVM obtains a globally optimal solution by solving a quadratic programming (QP) problem, which consumes a great deal of memory and time in practical applications. Most existing acceleration methods reduce the number of support vectors by reducing the number of training samples, which speeds up training. This article is based on the paper "Support vector machine pre-selection based on vector projection" by Li Qing et al.
The basic idea: let m1 and m2 be the center points of the class-1 and class-2 samples, respectively. Let xf(0) be a class-1 sample, and let xf be the projection of xf(0) onto the line m1m2. A class center is easy to compute: it is the sum of all samples of that class divided by their number, i.e. the class mean.
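As a concrete sketch of this step (function names such as `class_center` and `project_onto_center_line` are my own, not from the paper), the class centers and the signed projections onto the line m1m2 can be computed with NumPy:

```python
import numpy as np

def class_center(X):
    """Class center = mean of the class samples: m = (1/n) * sum(x_i)."""
    return X.mean(axis=0)

def project_onto_center_line(X, m1, m2):
    """Signed projection length of each sample, measured from m1
    along the unit vector pointing from m1 toward m2."""
    u = (m2 - m1) / np.linalg.norm(m2 - m1)  # unit direction m1 -> m2
    return (X - m1) @ u                      # signed scalar projections

# Toy 2-D data (hypothetical): two Gaussian blobs standing in for the two classes.
rng = np.random.default_rng(0)
X1 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))  # class 1
X2 = rng.normal(loc=[3.0, 0.0], scale=0.5, size=(50, 2))  # class 2

m1, m2 = class_center(X1), class_center(X2)
p1 = project_onto_center_line(X1, m1, m2)  # projections of class-1 samples
R1 = p1.max()                              # longest signed projection (the R1 below)
```

The sign of the projection matters: a sample lying on the far side of m1 (away from m2) gets a negative projection, which is why R1 is defined with direction taken into account.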
Definition: let xi(0) be a class-1 sample. As shown in the figure above, R1 is the longest length of m1xf over all class-1 samples, measured along the direction from m1 to m2; note that this length is signed, so direction matters.
The boundary vectors are then defined as follows. Let d be the distance between m1 and m2. For class-1 samples, the boundary vectors are those whose projection onto m1m2 lies at a distance from m1 that is close to R1, i.e. less than R1 but greater than a threshold just below it; for class-2 samples, symmetrically, the boundary vectors are those whose projected distance from m2 toward m1 is less than R2 but greater than the corresponding threshold. In the overlapping case r1 + r2 > d, the boundary vectors are defined by a modified rule given in the paper.
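A minimal sketch of the selection step, under stated assumptions: the paper gives exact thresholds for the boundary band, which I do not reproduce here; instead, the hypothetical parameter `threshold_frac` simply keeps the samples projected deepest toward the other class, which captures the spirit of the rule.

```python
import numpy as np

def boundary_vectors(X, m_self, m_other, threshold_frac=0.5):
    """Keep samples of one class whose signed projection toward the
    other class exceeds threshold_frac of the projection range.

    threshold_frac is a hypothetical stand-in for the paper's exact
    boundary-band thresholds (which depend on R1, R2, and d)."""
    u = (m_other - m_self) / np.linalg.norm(m_other - m_self)
    p = (X - m_self) @ u              # signed projections toward the other class
    lo, R = p.min(), p.max()          # R plays the role of R1 (or R2)
    threshold = lo + threshold_frac * (R - lo)
    return X[p >= threshold]          # samples inside the boundary band
```

For example, for four collinear class-1 samples at x = 0, 1, 2, 3 with the other center far to the right, `threshold_frac=0.5` keeps only the two samples nearest the other class.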
The central claim of the paper is that the set of boundary vectors contains most of the support vectors. Pre-selection therefore keeps only the boundary vectors from the full training set for subsequent SVM training, which discards many unnecessary samples and greatly reduces training time.
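To illustrate the overall pipeline (a sketch using scikit-learn, not the paper's own implementation; the band-selection rule and all parameter values here are my simplifications), we can compare an SVM trained on the full set with one trained only on the pre-selected boundary samples:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X1 = rng.normal([0.0, 0.0], 0.6, size=(200, 2))   # class +1
X2 = rng.normal([2.5, 0.0], 0.6, size=(200, 2))   # class -1
m1, m2 = X1.mean(axis=0), X2.mean(axis=0)

def band(X, m_self, m_other, keep_ratio=0.4):
    """Keep the keep_ratio fraction of samples projected deepest toward
    the other class (hypothetical stand-in for the paper's exact rule)."""
    u = (m_other - m_self) / np.linalg.norm(m_other - m_self)
    p = (X - m_self) @ u
    cutoff = np.quantile(p, 1.0 - keep_ratio)
    return X[p >= cutoff]

B1, B2 = band(X1, m1, m2), band(X2, m2, m1)

X_full = np.vstack([X1, X2]); y_full = np.r_[np.ones(200), -np.ones(200)]
X_pre  = np.vstack([B1, B2]); y_pre  = np.r_[np.ones(len(B1)), -np.ones(len(B2))]

svm_full = SVC(kernel="linear", C=1.0).fit(X_full, y_full)
svm_pre  = SVC(kernel="linear", C=1.0).fit(X_pre, y_pre)   # far fewer samples

acc_full = svm_full.score(X_full, y_full)
acc_pre  = svm_pre.score(X_full, y_full)  # evaluated on the full set
```

On data like this, the pre-selected model trains on a fraction of the samples yet places the separating hyperplane in roughly the same region, since the samples near the class boundary are the ones retained.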
There are, of course, many further details; for the specifics, see the paper "Support vector machine pre-selection based on vector projection" by Li Qing et al.