Understanding SVM (iii)--extending to multiple classes

Source: Internet
Author: User
Tags: svm


The first two posts in this series covered the basic principles of SVM, its code implementation, and how to handle the linearly non-separable case. This time we cover the last SVM topic: using SVM to solve multi-class classification problems.

1. One vs. rest

This way of extending SVM is simple, crude, and easy to understand. Concretely, suppose we have data from 5 classes.

Step 1: Take the samples of class 1 as positive samples and the samples of classes 2, 3, 4, and 5 as negative samples. Train an SVM on this data to obtain a binary classifier.

Step 2: Take the samples of class 2 as positive samples and the samples of classes 1, 3, 4, and 5 as negative samples. Train an SVM on this data to obtain a binary classifier.

Step 3: Take the samples of class 3 as positive samples, ...

Step 4: Take the samples of class 4 as positive samples, ...

Step 5: Take the samples of class 5 as positive samples, ...

Step 6: For a test sample, run the 5 classifiers obtained above. Each classifier answers either "my class" or "other", until some classifier claims the test sample as its own.

This method is simple and its time complexity is not too high. But consider the following problem: sometimes several of the 5 classifiers claim the test sample as their own, or none of them do. What then?

If several classifiers claim the sample, this is called the classification overlap phenomenon. We can let the final result be decided by the classifier with the largest margin (decision value), or resolve it by voting (see the sketch below).

If no classifier claims the sample, this is called the unclassifiable phenomenon. This case is harder to resolve, and no reliable result can be obtained.

In addition, this approach has another drawback: it creates a class imbalance problem. When there are many classes, the "other" side contains far more samples than the positive class, and the SVM classifier becomes badly biased toward the large "other" class. In that case the failure is not the unclassifiable phenomenon; the classifier simply misclassifies. We will discuss class imbalance in a later post on this blog.
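To make this concrete, here is a minimal one-vs-rest sketch in Python, assuming scikit-learn and a hypothetical 5-class toy dataset from make_blobs (none of this code is from the original series). Classification overlap is resolved by taking the classifier with the largest decision value:

```python
# Minimal one-vs-rest sketch (hypothetical toy data, not from the original series).
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import LinearSVC

X, y = make_blobs(n_samples=500, centers=5, random_state=0)  # 5 classes: 0..4

# Train one binary SVM per class: class k positive, all other classes negative.
classifiers = [LinearSVC().fit(X, (y == k).astype(int)) for k in range(5)]

def predict(x):
    # Resolve classification overlap: pick the class whose classifier
    # reports the largest signed decision value (distance to the margin).
    scores = [clf.decision_function(x.reshape(1, -1))[0] for clf in classifiers]
    return int(np.argmax(scores))

print(predict(X[0]), y[0])  # predicted class vs. true class
```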

2. One vs. one

This approach also converts the multi-class problem into binary problems, except that each binary problem pits one class directly against another, so the imbalance problem does not arise here.

Step 1: Take the samples of class 1 as positive samples and the samples of class 2 as negative samples. Train an SVM to obtain a binary classifier.

Step 2: Take the samples of class 1 as positive samples and the samples of class 3 as negative samples. Train an SVM to obtain a binary classifier.

...... (there are C(5, 2) = 10 classifiers in total)

Step 11: Run the test sample through each of these classifiers in turn; each one votes for one of its two classes. Then tally the votes over the 5 classes and take the class with the most votes as the final result (a class can collect at most 4 votes). See the sketch below.

This method also suffers from the classification overlap phenomenon. Moreover, as the number of classes grows, we must train many binary classifiers, far more than the one-vs-rest method requires, and the time complexity becomes unbearable.
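Here is a matching one-vs-one sketch, with the C(5, 2) = 10 pairwise classifiers trained explicitly and the votes tallied by hand (again hypothetical toy code; note that scikit-learn's SVC already uses the one-vs-one strategy internally):

```python
# Minimal one-vs-one sketch: C(5, 2) = 10 pairwise SVMs plus majority voting.
import numpy as np
from itertools import combinations
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=500, centers=5, random_state=0)

# One binary classifier per unordered pair of classes.
pairwise = {}
for i, j in combinations(range(5), 2):
    mask = (y == i) | (y == j)
    pairwise[(i, j)] = SVC(kernel="linear").fit(X[mask], y[mask])

def predict(x):
    votes = np.zeros(5, dtype=int)
    for clf in pairwise.values():
        votes[clf.predict(x.reshape(1, -1))[0]] += 1  # pairwise winner gets a vote
    return int(np.argmax(votes))  # a class can collect at most 4 votes

print(predict(X[0]), y[0])
```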

3. DAG method

Similar to the one vs. one method above, we can take each pairwise binary classifier as a node and construct a directed acyclic graph (DAG), as shown in the figure below:

[Figure: decision DAG built from the pairwise binary classifiers]
The figure depicts the decision process of this method: classification proceeds from the top down, and at each node the classifier's result determines whether to branch left or right. This continues until a leaf is reached; each leaf corresponds to one class. This method does not need to run every classifier, which speeds up testing, and it avoids the tied-vote problem of classification overlap. Its disadvantage, however, is error accumulation: if the classifier at the first node is wrong, everything that follows stays wrong. A sketch of the decision procedure follows below.
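Here is a minimal sketch of the DAG decision procedure, reusing the `pairwise` dictionary from the one-vs-one sketch above (the elimination scheme follows the standard DAGSVM idea: each node compares the first and last remaining candidates and drops the loser):

```python
# Minimal DAG (DAGSVM-style) decision sketch; `pairwise` is the dictionary of
# pairwise classifiers trained in the one-vs-one sketch above.
def dag_predict(x, pairwise, n_classes=5):
    candidates = list(range(n_classes))
    while len(candidates) > 1:
        i, j = candidates[0], candidates[-1]
        winner = pairwise[(min(i, j), max(i, j))].predict(x.reshape(1, -1))[0]
        if winner == i:
            candidates.pop()     # node voted against j: eliminate it
        else:
            candidates.pop(0)    # node voted against i: eliminate it
    return candidates[0]         # only n_classes - 1 classifiers were evaluated

print(dag_predict(X[0], pairwise), y[0])
```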

4. Decision tree-based approach

In fact, methods 2 and 3 above require training a daunting number of classifiers; the decision-tree-based method reduces the number of binary classifiers we need to learn. For a dataset, we can use a clustering method (such as K-means) to divide the classes into two subsets, then further divide each subset, looping until each subset contains only one class. In this way we obtain an inverted binary tree. Finally, we train an SVM classifier at each internal (decision) node of the tree, and we can see that far fewer classifiers are needed. Different tree structures (not necessarily complete binary trees) give different variants of the method; with a balanced binary tree, the number of classifiers that must be evaluated to classify a sample is smallest (about log2 of the number of classes). See the sketch below.
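Here is a minimal sketch of this tree-based idea, splitting the set of classes with K-means on the class centroids and training one binary SVM per internal node (the helper functions are hypothetical illustrations, assuming the toy X, y from the sketches above):

```python
# Minimal binary-tree-of-SVMs sketch; X, y are the toy data from the sketches above.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

def build_tree(X, y, classes):
    if len(classes) == 1:
        return classes[0]                     # leaf: a single class remains
    # Split the classes into two groups by clustering their centroids.
    centroids = np.array([X[y == c].mean(axis=0) for c in classes])
    split = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(centroids)
    left = [c for c, s in zip(classes, split) if s == 0]
    right = [c for c, s in zip(classes, split) if s == 1]
    # Train one binary SVM separating the two groups (right group = positive).
    mask = np.isin(y, classes)
    clf = LinearSVC().fit(X[mask], np.isin(y[mask], right).astype(int))
    return (clf, build_tree(X, y, left), build_tree(X, y, right))

def tree_predict(x, node):
    while isinstance(node, tuple):            # descend until a leaf is reached
        clf, left_node, right_node = node
        node = right_node if clf.predict(x.reshape(1, -1))[0] == 1 else left_node
    return node

tree = build_tree(X, y, sorted(set(y.tolist())))
print(tree_predict(X[0], tree), y[0])
```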

5. Error-correcting output codes (ECOC)

Suppose a dataset has K classes. We train L binary classifiers (not necessarily SVMs), each producing a result represented by +1 or -1. Thus, for the K-class dataset, we can learn a K x L coding matrix: row k records the L outputs that samples of class k should produce.

Then, for a test sample, we use the same L classifiers to obtain a vector of length L, compute the Hamming distance between this vector and each row of the K x L matrix, and take the row with the smallest distance as the classification result. See the sketch below.
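Here is a minimal ECOC sketch with a hand-written 5 x 4 code matrix (this toy code is purely illustrative, not a standard error-correcting code; a real ECOC uses more columns so that rows are far apart in Hamming distance and individual classifier errors can be corrected). It assumes the toy X, y from the sketches above:

```python
# Minimal error-correcting-output-codes sketch; X, y are the toy data above.
import numpy as np
from sklearn.svm import LinearSVC

# Hypothetical K x L code matrix (K = 5 classes, L = 4 binary problems);
# row k is the codeword that samples of class k should produce.
code = np.array([[+1, +1, +1, -1],   # class 0
                 [+1, -1, -1, +1],   # class 1
                 [-1, +1, -1, -1],   # class 2
                 [-1, -1, +1, +1],   # class 3
                 [+1, +1, -1, +1]])  # class 4

# One binary classifier per column: positive where the code bit is +1.
columns = [LinearSVC().fit(X, (code[y, l] > 0).astype(int))
           for l in range(code.shape[1])]

def predict(x):
    bits = np.array([1 if clf.predict(x.reshape(1, -1))[0] == 1 else -1
                     for clf in columns])
    hamming = (code != bits).sum(axis=1)   # Hamming distance to each row
    return int(np.argmin(hamming))         # closest codeword wins

print(predict(X[0]), y[0])
```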



All 5 methods above have their advantages and disadvantages; in a specific application, choose whichever works best. That said, there are now many data mining methods that handle multi-class classification directly, which we will introduce later.

Note that the discussion above touches on the class imbalance learning problem; a dedicated column on the main solutions to that problem will follow. Please keep following this blog. Thanks!

