Li Hongyi Machine Learning Notes - 35 (Ensemble Method, Part 1)

Ensemble

Ensemble is informally known as "ganging up": if you want really good performance, you basically have to use this trick.

Often you have already developed some algorithms and code and are reluctant to modify them. Ensemble improves your machine with little modification.

Framework of Ensemble

Pull out a bunch of classifiers, preferably different and complementary ones, and combine them properly so that each classifier plays a different role.

Ensemble has many variants for dealing with different situations.

ensemble:bagging

Let's go back to bias and variance.

A complex model tends to have small bias but large variance.
We can average several complex models to reduce the variance.
If we average all the f* trained on different data sets, the average is close to f^.
So if you have a complex model whose error comes mainly from variance, you can train multiple high-variance models and then average them. This method is called bagging.

Sample N' examples with replacement from the N training examples (N' is often simply N), train one function on each resampled set, and then average the functions' outputs (for regression) or let them vote (for classification).
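Below is a minimal sketch of this procedure in Python. It assumes a scikit-learn-style model object with fit/predict; the names make_model, bagging_train, and bagging_predict are illustrative, not from the lecture.

```python
import numpy as np

def bagging_train(X, y, make_model, n_models=10, seed=0):
    """Train n_models copies of a model, each on a bootstrap resample
    (N' = N examples drawn with replacement from the training set)."""
    rng = np.random.default_rng(seed)
    N = len(X)
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, N, size=N)  # sample N' examples with replacement
        model = make_model()
        model.fit(X[idx], y[idx])
        models.append(model)
    return models

def bagging_predict(models, X):
    """Average the individual outputs (use a majority vote for classification)."""
    return np.mean([m.predict(X) for m in models], axis=0)
```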

This approach is helpful when your model is complex and easy to overfit, e.g. a decision tree.

Decision Tree

A decision tree can fit the training data very easily, even though each individual split is simple.

A decision tree can handle not only simple toy problems but also quite complex ones.
For example, take the picture of Hatsune Miku below: points belonging to Miku are class 1, everything else is class 0. The input is the coordinate on this two-dimensional plane, and the task is to decide whether a point is part of Miku.

(Figure: experimental results for different tree depths)

It is easy to see that a decision tree grown deep enough can fit the training data perfectly, because it can eventually give every single example its own leaf; such a tree obviously overfits easily.
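A small sketch of this depth effect, using synthetic 2D data in place of the Miku image (the disk-shaped labels here are an assumption, purely for illustration):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(2000, 2))           # points on the 2D plane
y = (X[:, 0]**2 + X[:, 1]**2 < 0.5).astype(int)  # class 1 inside a disk (stand-in for "part of Miku")

for depth in [1, 3, 5, 10, 20, None]:            # None lets the tree grow until leaves are pure
    tree = DecisionTreeClassifier(max_depth=depth).fit(X, y)
    print(depth, tree.score(X, y))  # training accuracy climbs to 1.0 as depth grows
```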

Applying bagging to decision trees gives a Random Forest.

Random Forest

Randomness is added so the trees do not all grow alike: at each node split, randomly decide which features may be used and which may not, to keep every tree from growing the same way.
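A sketch with scikit-learn's random forest, where max_features is the knob that restricts which features each split may consider (the specific settings below are just an example):

```python
from sklearn.ensemble import RandomForestClassifier

forest = RandomForestClassifier(
    n_estimators=100,     # number of bagged trees
    max_features="sqrt",  # each split may only consider a random subset of features
    bootstrap=True,       # each tree trains on a bootstrap resample
)
forest.fit(X, y)  # X, y as in the earlier sketch
```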
Bagging also comes with a built-in way to validate, called out-of-bag (OOB) estimation.

out-of-bag (OOB)
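Each bootstrap resample leaves some examples out, so every function can be evaluated on the examples it never saw, giving a validation-style score without a held-out set. A sketch using scikit-learn's built-in OOB score (again reusing X, y from above):

```python
from sklearn.ensemble import RandomForestClassifier

forest = RandomForestClassifier(n_estimators=100, oob_score=True)
forest.fit(X, y)          # X, y as in the earlier sketch
print(forest.oob_score_)  # accuracy estimated from the left-out examples
```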


The goal of bagging is not to perform better on the training data but to reduce variance, resulting in a smoother function (Miku's outline becomes smoother and more delicate).

ensemble:boosting

Boosting's goal is the opposite of bagging's. Bagging is used when the model overfits; boosting is used when the model cannot even fit the training data, and it works by combining many weak classifiers into one strong classifier.

That is to say, as long as an algorithm can do slightly better than random guessing, boosting can turn it into an extremely strong one.
Note: the classifiers are learned sequentially.

How to obtain different classifiers?

Different classifiers are obtained by training on different training data. How do we get different training data?

One way is re-weighting: attach a weight to each training example, so that an example's effective count can be a decimal rather than an integer.

Adaboost
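Since AdaBoost works with these fractional weights, it measures a classifier's error as a weighted error rate. A small sketch of that computation (weighted_error is an illustrative name):

```python
import numpy as np

def weighted_error(y_true, y_pred, u):
    """epsilon = (total weight of misclassified examples) / (total weight)."""
    u = np.asarray(u, dtype=float)
    wrong = np.asarray(y_true) != np.asarray(y_pred)
    return u[wrong].sum() / u.sum()

# One example misclassified; its weight 1.3 out of a total of 4.0 gives 0.325.
print(weighted_error([1, 0, 1, 1], [1, 0, 0, 1], u=[0.7, 1.3, 1.3, 0.7]))
```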

The idea of AdaBoost: re-weight the training data so that the performance of f1 on the new weights would be random (its weighted error rate becomes 0.5), then train f2 on the re-weighted data, which forces f2 to be complementary to f1.

Let's look at an example.

Re-weighting Training Data

It is like re-grading an exam: the questions you got wrong are given more weight and the ones you got right less, so a score of 75 can suddenly drop to a failing grade.
How do we find new training-data weights on which f1 fails?

What is the value of d1?

Let Z1 be the sum of all the example weights u1^n. The total weight of the examples f1 classifies correctly is Z1(1 − ε1), and the total weight of the ones it gets wrong is Z1ε1. Multiply the weight of every wrong example by d1, divide the weight of every correct one by d1, and require the two totals to be equal:

Z1 ε1 d1 = Z1 (1 − ε1) / d1

which gives d1 = √((1 − ε1)/ε1); since ε1 < 0.5, we get d1 > 1.

Algorithm for AdaBoost
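A minimal sketch of the resulting algorithm. It assumes depth-1 decision trees (stumps) as the weak learner, which the lecture leaves abstract; adaboost and adaboost_predict are illustrative names, and labels must be +1/−1 (map 0/1 labels via 2*y − 1 before calling):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost(X, y, T=10):
    """y must be +1/-1. Returns the weak learners f_t and their vote weights."""
    N = len(X)
    u = np.ones(N)  # initial weights u_1^n = 1
    learners, alphas = [], []
    for _ in range(T):
        f = DecisionTreeClassifier(max_depth=1)  # a decision stump
        f.fit(X, y, sample_weight=u)
        wrong = f.predict(X) != y
        eps = u[wrong].sum() / u.sum()     # weighted error rate, assumed < 0.5
        d = np.sqrt((1 - eps) / eps)       # d_t > 1
        u = np.where(wrong, u * d, u / d)  # failed examples get heavier
        learners.append(f)
        alphas.append(np.log(d))           # vote weight alpha_t = ln d_t
    return learners, alphas

def adaboost_predict(learners, alphas, X):
    """H(x) = sign( sum_t alpha_t * f_t(x) )"""
    votes = sum(a * f.predict(X) for f, a in zip(learners, alphas))
    return np.sign(votes)
```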

Continued in Part 2.
