Decision Tree algorithm

Source: Internet
Author: User

A decision tree is a collection of if-then rules. Its advantages are that the model is readable and classification is fast.

Algorithms commonly used for decision trees: the ID3 algorithm, the C4.5 algorithm, and the CART algorithm.

1. Entropy (also called information entropy)

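For a discrete random variable X with $P(X = x_i) = p_i$, $i = 1, \dots, n$, the entropy is $H(X) = -\sum_{i=1}^{n} p_i \log p_i$. Entropy therefore depends only on the distribution of X, not on the values that X takes.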

    • The greater the entropy, the greater the uncertainty of the random variable X.
    • If some probability is 0, define $0 \log 0 = 0$. The unit is the bit (logarithm base 2) or the nat (logarithm base e).
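As an illustration of the two points above, here is a minimal sketch of empirical entropy in Python. The function name `entropy` and the toy sequences are assumptions made for this example, not part of the original article.

```python
import math
from collections import Counter

def entropy(values, base=2):
    """Empirical entropy of a sequence of discrete values (bits when base=2)."""
    n = len(values)
    # Outcomes that never occur are absent from the Counter, which matches
    # the convention 0 * log(0) = 0.
    return -sum((c / n) * math.log(c / n, base) for c in Counter(values).values())

coin = ["H", "T", "H", "T"]
relabeled = [0, 1, 0, 1]        # same distribution, different values
biased = ["H", "H", "H", "T"]   # less uncertain than the fair coin

print(entropy(coin), entropy(relabeled), entropy(biased))
```

The fair coin and its relabeled copy give the same result because only the frequencies enter the sum, and the biased sequence scores lower because it is less uncertain. Passing `base=math.e` gives the value in nats instead of bits.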

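2. Conditional entropy: the uncertainty of dataset D given a known feature A

If feature A partitions D into subsets $D_1, \dots, D_n$, then $H(D|A) = \sum_{i=1}^{n} \frac{|D_i|}{|D|} H(D_i)$, where $H(D_i)$ is the entropy of the class labels in $D_i$.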


Conditional probability formula: $P(Y = y \mid X = x) = P(X = x, Y = y) / P(X = x)$
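Conditional entropy can be computed by grouping the samples by feature value and weighting each group's entropy by its share of the data. The sketch below assumes that standard empirical definition; `conditional_entropy` and the toy weather data are hypothetical names and values chosen for illustration.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Empirical entropy of the class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def conditional_entropy(feature_values, labels):
    """H(D|A): entropy of the labels inside each feature-value group,
    weighted by the fraction of samples falling in that group."""
    groups = defaultdict(list)
    for a, y in zip(feature_values, labels):
        groups[a].append(y)
    n = len(labels)
    return sum(len(g) / n * entropy(g) for g in groups.values())

# Toy data (hypothetical): one feature, one class label per sample.
outlook = ["sunny", "sunny", "rain", "rain"]
play = ["no", "yes", "yes", "yes"]

print(entropy(play), conditional_entropy(outlook, play))
```

In this toy data the "rain" group is pure and the "sunny" group is evenly mixed, so knowing the feature lowers the remaining uncertainty below the entropy of the labels alone.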

3. Information gain (biased toward features that take many distinct values)

$g(D, A) = H(D) - H(D|A)$
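The bias toward many-valued features is easy to see numerically. The following sketch assumes the usual definition of information gain as the entropy of the labels minus the conditional entropy given the feature; the function names and toy columns are invented for this example.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Empirical entropy of the class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus their conditional entropy given the feature."""
    groups = defaultdict(list)
    for a, y in zip(feature_values, labels):
        groups[a].append(y)
    n = len(labels)
    cond = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - cond

labels = ["yes", "yes", "no", "no"]
windy = [True, False, True, False]   # carries no information about the label
row_id = [1, 2, 3, 4]                # unique per sample, useless for prediction

print(information_gain(windy, labels), information_gain(row_id, labels))
```

A column of unique identifiers splits the data into single-sample groups, each perfectly pure, so it reaches the maximum possible gain even though it cannot generalize.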

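4. Information gain ratio

$g_R(D, A) = g(D, A) / H_A(D)$, where $g(D, A)$ is the information gain and $H_A(D) = -\sum_{i=1}^{n} \frac{|D_i|}{|D|} \log_2 \frac{|D_i|}{|D|}$ is the entropy of D with respect to the values of feature A.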
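A minimal sketch of the gain ratio, assuming the standard definition (information gain divided by the entropy of the feature's own value distribution). The names `gain_ratio`, `row_id`, and `informative` are hypothetical, and returning 0.0 for a single-valued feature is a choice made here to avoid dividing by zero.

```python
import math
from collections import Counter, defaultdict

def entropy(values):
    """Empirical entropy of a sequence of discrete values, in bits."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def gain_ratio(feature_values, labels):
    """Information gain divided by the entropy of the feature's own values."""
    groups = defaultdict(list)
    for a, y in zip(feature_values, labels):
        groups[a].append(y)
    n = len(labels)
    gain = entropy(labels) - sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(feature_values)
    # A feature with a single value cannot split the data (assumed convention).
    return gain / split_info if split_info > 0 else 0.0

labels = ["yes", "yes", "no", "no"]
row_id = [1, 2, 3, 4]                 # four distinct values
informative = ["a", "a", "b", "b"]    # two values, perfectly aligned with labels

print(gain_ratio(row_id, labels), gain_ratio(informative, labels))
```

Both features have the same information gain on this toy data, but the many-valued identifier is penalized by its larger denominator, so the two-valued feature ranks higher.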

CART algorithm (Classification And Regression Tree)

1. Least squares regression tree: the minimum squared error criterion

The steps are as follows:
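The core of the squared error criterion can be sketched as follows, assuming the common formulation in which each candidate split point divides the samples into two regions, each region predicts the mean of its targets, and the split with the smallest total squared error wins. The sketch covers a single numeric feature only; `sse`, `best_split`, and the data are hypothetical.

```python
def sse(ys):
    """Sum of squared errors when the region predicts the mean of its targets."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def best_split(xs, ys):
    """Return (threshold, total squared error) for the best split x <= threshold,
    or None when the feature has a single distinct value."""
    best = None
    # The largest value is skipped: splitting there leaves the right side empty.
    for s in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= s]
        right = [y for x, y in zip(xs, ys) if x > s]
        err = sse(left) + sse(right)
        if best is None or err < best[1]:
            best = (s, err)
    return best

# Toy data (hypothetical): the target jumps between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 5.0, 5.1, 4.9]

print(best_split(xs, ys))
```

A full regression tree would repeat this search over every feature and then recurse into the two resulting regions until a stopping condition is met.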

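2. Classification tree: the minimum Gini index criterion

Gini index (represents the uncertainty of the set D): $\mathrm{Gini}(D) = 1 - \sum_{k=1}^{K} \left( \frac{|C_k|}{|D|} \right)^2$, where $C_k$ is the subset of samples in D that belong to class k. If D is split into $D_1$ and $D_2$ according to whether feature A takes the value a, then $\mathrm{Gini}(D, A) = \frac{|D_1|}{|D|} \mathrm{Gini}(D_1) + \frac{|D_2|}{|D|} \mathrm{Gini}(D_2)$.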
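A minimal sketch of the Gini index for a label set and for a binary split, assuming the standard definitions (one minus the sum of squared class proportions, and the size-weighted average over the two subsets). The function names are illustrative, and treating an empty set as having Gini 0 is a convention chosen here.

```python
from collections import Counter

def gini(labels):
    """Gini index of a set of class labels: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumed convention for an empty subset
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_split(left, right):
    """Gini index of a binary split: size-weighted average of the two subsets."""
    n = len(left) + len(right)
    if n == 0:
        return 0.0
    return len(left) / n * gini(left) + len(right) / n * gini(right)

mixed = ["a", "a", "b", "b"]
print(gini(mixed))                            # evenly mixed two-class set
print(gini_split(["a", "a"], ["b", "b"]))     # split into two pure subsets
print(gini_split(["a", "b"], ["a", "b"]))     # split that separates nothing
```

A pure set has Gini index 0, and the value grows as the classes become more evenly mixed, which is why a lower Gini index after a split indicates a better split.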

The steps are as follows:

1) For each feature A of dataset D and each possible value a of A, split D into two parts according to whether A = a, and calculate the Gini index of that split.

2) Among all features and all candidate split points, select the feature and split point with the smallest Gini index as the optimal feature and the optimal split point.

3) Assign the training set to two child nodes according to the optimal feature and split point, and recursively apply steps (1) and (2) to each child node until a stopping condition is met.

4) Generate the CART decision tree.
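The steps above can be sketched as a small recursive builder. This is an illustrative sketch rather than a reference implementation: it assumes categorical features stored as tuples, binary tests of the form "feature equals value", and majority-vote leaves, and it only stops on small or pure nodes. All names (`best_split`, `build`, `predict`) and the toy data are hypothetical.

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Steps 1-2: try every test row[f] == v and keep the lowest Gini index."""
    best, n = None, len(rows)
    for f in range(len(rows[0])):
        for v in sorted(set(row[f] for row in rows)):
            left = [y for row, y in zip(rows, labels) if row[f] == v]
            right = [y for row, y in zip(rows, labels) if row[f] != v]
            if not left or not right:
                continue  # this test does not actually split the node
            score = len(left) / n * gini(left) + len(right) / n * gini(right)
            if best is None or score < best[0]:
                best = (score, f, v)
    return best

def build(rows, labels, min_samples=2, min_gini=1e-9):
    """Steps 3-4: split recursively; a leaf stores the majority label."""
    split = None
    if len(rows) >= min_samples and gini(labels) > min_gini:
        split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    _, f, v = split
    yes = [(r, y) for r, y in zip(rows, labels) if r[f] == v]
    no = [(r, y) for r, y in zip(rows, labels) if r[f] != v]
    return {"feature": f, "value": v,
            "yes": build([r for r, _ in yes], [y for _, y in yes], min_samples, min_gini),
            "no": build([r for r, _ in no], [y for _, y in no], min_samples, min_gini)}

def predict(tree, row):
    """Follow the tests from the root until a leaf label is reached."""
    while isinstance(tree, dict):
        tree = tree["yes"] if row[tree["feature"]] == tree["value"] else tree["no"]
    return tree

# Toy data (hypothetical): (outlook, temperature) -> play
rows = [("sunny", "hot"), ("sunny", "mild"), ("rain", "hot"), ("rain", "mild")]
labels = ["no", "no", "yes", "yes"]
tree = build(rows, labels)
print(tree)
print(predict(tree, ("rain", "hot")))
```

Numeric features would need threshold tests instead of equality tests, and a practical implementation would usually add pruning after the tree is grown.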

Algorithm stopping conditions:

    • The number of samples in the node is less than a predetermined threshold.
    • The Gini index of the sample set is less than a predetermined threshold.
    • There are no more features to split on.
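The three conditions can be combined into a single check that runs before a node is split. A minimal sketch follows; the function name `should_stop` and the default thresholds are arbitrary choices for illustration, not values given in the article.

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def should_stop(labels, remaining_features, min_samples=5, min_gini=0.01):
    """Return True if the node should become a leaf instead of being split.
    The default thresholds are assumed values, chosen only for this example."""
    if len(labels) < min_samples:      # too few samples in the node
        return True
    if gini(labels) < min_gini:        # node is already (almost) pure
        return True
    if not remaining_features:         # nothing left to split on
        return True
    return False

print(should_stop(["a"] * 3, [0, 1]))        # few samples
print(should_stop(["a"] * 10, [0, 1]))       # pure node
print(should_stop(["a", "b"] * 5, []))       # no features left
print(should_stop(["a", "b"] * 5, [0, 1]))   # none of the conditions hold
```

Larger thresholds produce shallower trees that are less likely to overfit, at the cost of a coarser fit to the training data.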
