Python Machine Learning: 7.1 Ensemble Learning

Source: Internet
Author: User

The idea behind ensemble learning is to combine different classifiers into a meta-classifier that generalizes better than any single classifier on its own. For example, suppose we have collected predictions about an event from 10 experts; ensemble learning combines these 10 predictions into a single, more accurate forecast.

We will learn later that there are different ways to build an ensemble model. This section first addresses a basic question: why use ensemble learning, and why is it better than a single model?

This book is for beginners, so here we focus only on the most basic ensemble method: majority voting. Majority voting means that the final predicted class is the one predicted by the majority of the individual classifiers, where "majority" means more than 50% of the votes. Strictly speaking, majority voting applies only to binary classification, but it extends easily to the multi-class setting, where it is called plurality voting: the class that receives the most votes wins.

[Figure: an example of voting with a total of 10 base classifiers]

We train m different classifiers on the training set; the classifiers could be, for example, a decision tree, an SVM, or logistic regression (LR). We can also use the same base algorithm for every model but fit each one with different parameters or on a different subset of the training set (for example, bootstrap samples). Random forest is an example of this strategy: it combines different decision tree models.

[Figure: the steps of the ensemble method using the voting strategy]

The voting strategy is very simple: we collect the class label predicted by each individual classifier and take the label with the most votes as the final prediction.
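As a minimal sketch of this counting step, the snippet below tallies the predicted labels and returns the most common one. The function name `plurality_vote` and the example labels are hypothetical, chosen only for illustration; ties are resolved in favor of the label seen first, which is an arbitrary choice and not something the text specifies.

```python
from collections import Counter

def plurality_vote(predictions):
    """Return the label predicted by the largest number of base classifiers."""
    counts = Counter(predictions)
    # most_common(1) returns a list with the single (label, count) pair
    # that has the highest count.
    label, _ = counts.most_common(1)[0]
    return label

# Hypothetical predictions from 10 base classifiers for one sample.
votes = ["cat", "dog", "cat", "bird", "cat", "dog", "cat", "cat", "dog", "cat"]
print(plurality_vote(votes))  # prints cat (6 of the 10 votes)
```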

Take binary classification as an example, with class labels class1 = -1 and class2 = +1. The voting prediction then works as follows: add up the predictions of the individual classifiers; if the sum is greater than 0, the predicted class is the positive class, otherwise it is the negative class.
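The sum-and-threshold rule for -1/+1 labels can be sketched in a few lines. The function name `binary_vote` is hypothetical. Note that with an even number of classifiers a tie (sum of 0) falls to the negative class under the rule as stated.

```python
def binary_vote(predictions):
    """Combine -1/+1 predictions: positive class if the sum is > 0, else negative."""
    total = sum(predictions)
    return 1 if total > 0 else -1

print(binary_vote([1, 1, -1, 1, -1]))   # three positive votes, two negative: prints 1
print(binary_vote([-1, -1, 1, -1, 1]))  # two positive votes, three negative: prints -1
```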

At this point a natural question arises: why does an ensemble perform better than a single classifier? The reasoning is simple and needs only a little combinatorics. Assume a binary classification problem with n base classifiers, where every base classifier has the same error rate and the classifiers are independent of one another, so that their errors are uncorrelated. Under these assumptions, the error probability of the ensemble follows from the binomial distribution.
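Written out, the ensemble is wrong when a majority of the base classifiers are wrong, which is a binomial tail probability. In the sketch below, the symbols ε (the shared error rate of the base classifiers), y (the number of base classifiers that predict incorrectly), and k (the smallest number of wrong votes that forms a majority) are notation assumed here for illustration, and n is taken to be odd so that ties cannot occur.

```latex
P(y \ge k) = \sum_{i=k}^{n} \binom{n}{i} \, \varepsilon^{i} (1 - \varepsilon)^{n - i}
           = \varepsilon_{\text{ensemble}},
\qquad k = \frac{n + 1}{2}
```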

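For example, with n = 11 base classifiers, each with an error rate of 0.25, the ensemble prediction is wrong only if at least 6 of the base classifiers predict incorrectly. The probability of that happening is the sum of the binomial probabilities for 6 through 11 wrong classifiers.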
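This calculation is easy to check numerically. The sketch below sums the binomial probabilities of 6 through 11 wrong base classifiers; the function name `ensemble_error` and its parameter names are hypothetical, and the code assumes an odd number of independent base classifiers with equal error rates, as in the text.

```python
from math import comb

def ensemble_error(n_classifiers, base_error):
    """Probability that a majority of n independent base classifiers are wrong."""
    # Smallest number of wrong votes that forms a majority (6 when n = 11).
    k_start = n_classifiers // 2 + 1
    return sum(
        comb(n_classifiers, k)
        * base_error**k
        * (1 - base_error) ** (n_classifiers - k)
        for k in range(k_start, n_classifiers + 1)
    )

error = ensemble_error(n_classifiers=11, base_error=0.25)
print(f"{error:.3f}")  # prints 0.034
```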

The ensemble error rate is only about 0.034, much smaller than 0.25. There is a precondition, however, for the ensemble to beat a single classifier: the base classifiers must not be too weak. Each one must at least perform better than random guessing, that is, have an error rate below 0.5.

In short, under the assumptions above (independent base classifiers with equal error rates), the ensemble has a lower error rate than a single classifier as long as each base classifier performs better than random guessing.
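To see where the ensemble wins and where it loses, one can sweep the base error rate and compare it with the resulting ensemble error. This is a sketch under the same assumptions (11 independent base classifiers with equal error rates); the names `majority_error_rate` and `comparison` are hypothetical.

```python
from math import comb

def majority_error_rate(n, eps):
    """Binomial tail: probability that more than half of n classifiers are wrong."""
    return sum(
        comb(n, k) * eps**k * (1 - eps) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )

# Compare base error and ensemble error for base error rates 0.1, 0.2, ..., 0.9.
comparison = {}
for step in range(1, 10):
    eps = step / 10
    comparison[eps] = majority_error_rate(11, eps)
    print(f"base error {eps:.1f} -> ensemble error {comparison[eps]:.4f}")
```

Below a base error of 0.5 the ensemble error is smaller than the base error, at exactly 0.5 the two coincide, and above 0.5 voting makes things worse, which is why the base classifiers must beat random guessing.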

When reprinting, please credit the source: Python Machine Learning (http://www.aibbt.com/a/pythonmachinelearning/)
