[Introduction to Machine Learning] Li Hongyi machine learning notes - 35 (Ensemble: ensemble methods)
Ensemble
Commonly known as a "group brawl": if you want very good performance, you basically have to use this trick.
You have already developed some algorithms and code and are reluctant to modify them. Ensemble: improving your machine with little modification.
Framework of Ensemble
Take a bunch of classifiers, preferably different and complementary ones, and combine them properly so that each classifier plays a different role.
Ensemble comes in many different forms for dealing with different situations.
Ensemble: Bagging
Let's go back to bias and variance.
A complex model would have large variance.
We can average complex models to reduce variance.
If we average all the f*, the result will be close to f^.
So if your model is complex and its error comes mostly from variance, you can train multiple such high-variance models and then average them; this method is called bagging.
Sample N' examples with replacement to build each training set, train one function on each resampled set, and at test time average the outputs (for regression) or take a vote (for classification).
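A minimal sketch of that procedure (my own illustration, not code from the lecture; base_model_factory is a placeholder for any scikit-learn-style model constructor):

```python
import numpy as np

def bagging_fit(X, y, base_model_factory, n_models=10, n_prime=None, seed=0):
    """Train n_models copies of a base model, each on N' examples
    sampled from (X, y) with replacement (bootstrap sampling)."""
    rng = np.random.default_rng(seed)
    n_prime = n_prime or len(X)          # N' is often simply taken equal to N
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X), size=n_prime)   # sample with replacement
        model = base_model_factory()
        model.fit(X[idx], y[idx])
        models.append(model)
    return models

def bagging_predict(models, X):
    """Average the individual predictions (regression); for classification
    you would take a majority vote instead."""
    return np.mean([m.predict(X) for m in models], axis=0)
```

For instance, base_model_factory could be `lambda: DecisionTreeRegressor()` from sklearn.tree, since trees are exactly the kind of high-variance model bagging helps.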
This approach is helpful when your model is complex and easy to overfit, e.g. a decision tree.
Decision Tree
A decision tree is a simple model, and it can fit the training data very easily.
Decision trees can handle not only simple problems like the one in the figure, but also many complicated ones.
For example: in the figure below of Hatsune Miku (初音), points inside the Miku figure are class 1 and the rest are class 0; the input is a point on this two-dimensional plane, and the task is to decide whether that point belongs to the Miku figure.
Experimental results with different tree depths
It is easy to see that if the decision tree is made deep enough, it can classify the training data perfectly, because it can give every single example its own region; obviously this makes it very easy to overfit.
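A small sketch of such a depth experiment, using scikit-learn's DecisionTreeClassifier on a stand-in 2-D dataset (make_moons) rather than the Miku figure from the lecture:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in 2-D dataset; the lecture uses the Miku figure instead.
X, y = make_moons(n_samples=500, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Deeper trees fit the training data better and better; with no depth
# limit the tree can memorize every training point (training accuracy 1.0).
for depth in [1, 2, 3, 5, 10, None]:
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_train, y_train)
    print(depth, tree.score(X_train, y_train), tree.score(X_test, y_test))
```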
Doing bagging with decision trees gives a random forest.
Random Forest
Randomness is added on the features so that the trees end up less similar to one another. At each node split, you randomly decide which features may be used and which may not, which prevents every tree from growing to look the same.
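A minimal sketch using scikit-learn's standard RandomForestClassifier (a stock implementation, not the lecture's code); the max_features parameter is what restricts each split to a random subset of the features:

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=500, noise=0.3, random_state=0)  # stand-in 2-D data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 bagged trees; at each split only a random subset of the features
# (sqrt of the total here) may be considered, so the trees grow differently.
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
forest.fit(X_train, y_train)
print(forest.score(X_test, y_test))
```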
When you use bagging, there is also a way to do validation without a held-out set, called out-of-bag (OOB) estimation.
Out-of-bag (OOB)
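Since every model is trained on a bootstrap sample, each training example is left out of some models' samples; evaluating each example only with the models that never saw it gives a validation-like error estimate for free. A minimal sketch using scikit-learn's oob_score option (standard library behavior, not lecture code):

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier

X, y = make_moons(n_samples=500, noise=0.3, random_state=0)  # stand-in 2-D data

# oob_score=True: each tree is evaluated on the training examples that its
# own bootstrap sample did NOT contain, so no separate validation set is needed.
forest = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=0)
forest.fit(X, y)
print(forest.oob_score_)  # out-of-bag accuracy estimate
```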
The point of bagging is not to do better on the training data; it is to reduce variance, giving a smoother function (the boundary of the Miku figure becomes smoother).
Ensemble: Boosting
Boosting's goal is the opposite of bagging's: bagging is used to reduce overfitting, while boosting takes models that cannot even fit the training data well and finds a way to improve their performance. Boosting combines many weak classifiers into one strong classifier.
In other words, as long as an algorithm can do slightly better than random guessing, boosting can turn it into a very strong algorithm.
Note: the classifiers are learned sequentially.
How to obtain different classifiers?
Different classifiers are obtained by training on different training data. How do we get different training data?
By re-weighting the training data: a sample's effective count no longer has to be an integer; it can be a fraction (a real-valued weight).
AdaBoost
The performance of f1 on the new weights would be no better than random (error rate = 0.5), and f2 is then trained on those weights.
Let's look at an example.
Re-weighting Training Data
For example, the weight of the questions you answered wrong is increased and the weight of the questions you answered right is decreased, so a score of 75 suddenly becomes a failing grade.
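A hypothetical worked version of that story (the numbers are my own: 4 equally weighted questions, 3 answered correctly, and the re-weighting factor sqrt(3), which is the AdaBoost value for an error rate of 0.25):

```python
import numpy as np

correct, wrong = 3, 1                 # 3 questions right, 1 wrong
w_correct, w_wrong = 1.0, 1.0         # equal weights: score = 3/4
print(100 * correct * w_correct / (correct * w_correct + wrong * w_wrong))  # 75.0

d = np.sqrt(3)                        # AdaBoost-style re-weighting factor
w_wrong *= d                          # the question you got wrong counts more
w_correct /= d                        # the questions you got right count less
print(100 * correct * w_correct / (correct * w_correct + wrong * w_wrong))  # ~50.0: failing
```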
How do we find a new set of weights on which f1 fails? Multiply the weights of the examples f1 gets wrong by a factor d1, and divide the weights of the examples it gets right by d1. What should the value of d1 be? If Z1 is the total weight, the correctly classified examples have total weight Z1(1 - ε1) and the misclassified examples have total weight Z1·ε1. Requiring the two sides to be equal after re-weighting, Z1·ε1·d1 = Z1(1 - ε1)/d1, gives d1 = sqrt((1 - ε1)/ε1), which is greater than 1 because ε1 < 0.5.
Algorithm for AdaBoost
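A minimal from-scratch sketch of the resulting algorithm, assuming depth-1 decision trees (stumps) from scikit-learn as the weak learners and labels in {-1, +1}; this follows the standard AdaBoost recipe summarized above, not any code released with the lecture:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=10):
    """y must be in {-1, +1}. Returns the weak learners and their weights alpha_t."""
    n = len(X)
    w = np.ones(n) / n                       # initial example weights
    learners, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        eps = w[pred != y].sum() / w.sum()   # weighted error rate epsilon_t
        if eps == 0 or eps >= 0.5:           # weak learner must beat random guessing
            break
        d = np.sqrt((1 - eps) / eps)         # d_t = sqrt((1 - eps_t) / eps_t)
        w[pred != y] *= d                    # misclassified examples: weight * d_t
        w[pred == y] /= d                    # correct examples:       weight / d_t
        learners.append(stump)
        alphas.append(np.log(d))             # alpha_t = ln d_t
    return learners, alphas

def adaboost_predict(learners, alphas, X):
    """Final classifier: sign of the alpha-weighted vote of all weak learners."""
    votes = sum(a * clf.predict(X) for a, clf in zip(alphas, learners))
    return np.sign(votes)
```

The aggregation used here (sign of the alpha-weighted vote) is the standard AdaBoost choice for combining the weak classifiers.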
Next: Part 2