Machine Learning Techniques: Decision Trees and the CART (Classification and Regression Tree) Construction Algorithm


Course address: https://class.coursera.org/ntumltwo-002/lecture

Important! Important! Important ~

I. Decision Trees (Decision Tree), Bagging, and Adaptive Boosting (AdaBoost)

    • When bagging or AdaBoost makes a prediction, all of the weak classifiers act at the same time. The difference between them is whether each weak classifier carries the same weight in the final blended hypothesis G.
    • A decision tree is a conditional aggregation: at any given time, only the classifier whose branching condition is satisfied acts.
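The contrast between the two aggregation styles can be sketched in code. The weak classifiers g1–g3 and the AdaBoost weights below are toy assumptions for illustration, not values from the course:

```python
# Three toy weak classifiers on 2-D points (hypothetical examples).
def g1(x): return 1 if x[0] > 0 else -1
def g2(x): return 1 if x[1] > 0 else -1
def g3(x): return 1 if x[0] + x[1] > 0 else -1

weak = [g1, g2, g3]

def bagging_predict(x):
    # Bagging: every weak classifier votes, all with equal weight.
    return 1 if sum(g(x) for g in weak) > 0 else -1

def adaboost_predict(x, alphas=(0.5, 1.2, 0.3)):
    # AdaBoost: every weak classifier votes, each weighted by its alpha.
    return 1 if sum(a * g(x) for a, g in zip(alphas, weak)) > 0 else -1

def tree_predict(x):
    # Decision tree: only the classifier on the matched branch acts.
    if x[0] > 0:        # branching condition
        return g2(x)    # only g2 fires on this branch
    return g3(x)        # only g3 fires on the other branch
```

In the first two functions every weak classifier contributes to every prediction; in `tree_predict`, the branching condition routes each input to exactly one sub-classifier.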

II. The Basic Decision Tree Algorithm

1. Viewed recursively, a decision tree selects a branching condition based on the features, and each branch recursively generates a subtree; all of the subtrees together constitute the final decision tree.

For example, the following is a decision tree that decides whether to watch a MOOC online course based on features such as time at home, dating status, and job deadlines.

2. Description of the basic decision tree algorithm

    • Determine the branching condition. The branching condition can be chosen by hand or generated by an algorithm.
    • Divide the training data D according to the branching criterion.
    • Recursively generate subtrees according to the branching conditions until a termination condition is met.
    • To prevent overfitting and limit model complexity, the decision tree is usually regularized by pruning.
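The four steps above can be sketched as a small recursive procedure. This is a minimal illustration, not the course's exact pseudocode: it uses a single scalar feature, a toy midpoint threshold as the branching condition, and a depth cap as a crude stand-in for pruning.

```python
from collections import Counter

def majority(labels):
    """Most common label; used as the leaf value."""
    return Counter(labels).most_common(1)[0][0]

def build_tree(data, depth=0, max_depth=3):
    """data: list of (x, y) pairs with scalar feature x and label y."""
    labels = [y for _, y in data]
    # Termination condition: pure node, or depth cap (simple regularization).
    if len(set(labels)) == 1 or depth >= max_depth:
        return majority(labels)
    # 1. Determine the branching condition (toy choice: midpoint threshold).
    xs = sorted(x for x, _ in data)
    thr = (xs[0] + xs[-1]) / 2
    # 2. Divide the training data by the branching criterion.
    left = [(x, y) for x, y in data if x <= thr]
    right = [(x, y) for x, y in data if x > thr]
    if not left or not right:
        return majority(labels)
    # 3. Recursively generate the subtrees.
    return {"thr": thr,
            "left": build_tree(left, depth + 1, max_depth),
            "right": build_tree(right, depth + 1, max_depth)}

def predict(tree, x):
    """Follow branching conditions down to a leaf."""
    while isinstance(tree, dict):
        tree = tree["left"] if x <= tree["thr"] else tree["right"]
    return tree
```

A real implementation would choose the threshold by an impurity criterion rather than the midpoint; that refinement is exactly what CART adds in the next section.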

III. The CART Algorithm (Classification and Regression Tree)

Lin's explanation here is easy to follow; for more detail, see: http://blog.csdn.net/u011067360/article/details/24871801?utm_source=tuicool&utm_medium=Referral

    • CART is a binary tree: every internal node has exactly two branches.
    • The branching condition is determined by purity. For classification, the Gini index is the usual choice; for continuous (regression) targets, least squares deviation (LSD) or least absolute deviation (LAD) can be used.

Gini index: a number between 0 and 1. A value of 0 means the node is completely pure (all samples belong to one class); values closer to 1 mean the classes are more thoroughly mixed. The more mixed the categories contained in a node, the greater its Gini index (a concept similar to entropy).
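For a node whose classes occur with proportions p_k, the Gini index is 1 − Σ p_k². A direct implementation:

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions.
    0 for a pure node; grows toward 1 as classes become more mixed."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())
```

For example, a pure node scores 0, a 50/50 two-class node scores 0.5, and a node split evenly over four classes scores 0.75, matching the intuition that more mixed nodes get larger values.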

    1. Find the best feature and split point to segment on, according to the Gini index
    2. If the node cannot be split further, save it as a leaf node
    3. Perform the binary split
    4. Recursively call the createTree() method on the left subtree to create that subtree
    5. Recursively call the createTree() method on the right subtree to create that subtree

IV. Comparing CART with the AdaBoost Meta-Algorithm

CART is more efficient than AdaBoost, because CART makes conditional cuts (each split only partitions the data that reaches that node), while AdaBoost's decision stumps always cut across the entire feature space horizontally or vertically.

V. Characteristics of CART in Practice

