I. INTRODUCTION
An important task of decision tree learning is to extract the knowledge contained in the data.
Advantages of decision trees: low computational complexity, output that is easy for humans to understand, insensitivity to missing intermediate values, and the ability to handle irrelevant features.
Disadvantage: prone to overfitting.
Applicable data types: numeric and nominal.
II. General Process of Decision Trees
1. Collect data: any method can be used.
2. Prepare the data: the tree-construction algorithm applies only to nominal data, so numeric data must be discretized first.
3. Analyze the data: any method can be used; after the tree is built, check whether the resulting graph matches expectations.
4. Train the algorithm: construct the tree's data structure.
5. Test the algorithm: use the learned tree to compute the error rate.
6. Use the algorithm: this step applies to any supervised learning algorithm; decision trees help us better understand the intrinsic meaning of the data.
III. Representation of Decision Trees
A decision tree classifies an instance by sorting it from the root node down to some leaf node; the leaf node gives the category to which the instance belongs. Each internal node of the tree specifies a test of one attribute of the instance, and each branch descending from that node corresponds to one of the possible values of the attribute. To classify an instance, start at the root node of the tree, test the attribute specified by that node, and move down the branch corresponding to the instance's value for that attribute. The process is then repeated for the subtree rooted at the new node.
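The classification procedure just described can be sketched in a few lines of Python. The tree representation (leaves as label strings, internal nodes as (attribute, branch-dict) pairs) and the sample tree below are illustrative assumptions of this sketch, not structures given in the article:

```python
def classify(tree, instance):
    """Walk a decision tree from the root down to a leaf.
    Representation (an assumption of this sketch): a leaf is a label
    string; an internal node is an (attribute, {value: subtree}) pair."""
    while isinstance(tree, tuple):
        attribute, branches = tree
        tree = branches[instance[attribute]]  # follow the matching branch
    return tree

# A hand-built illustrative tree in the PlayTennis style:
tree = ("Outlook", {
    "Sunny": ("Humidity", {"High": "No", "Normal": "Yes"}),
    "Overcast": "Yes",
    "Rain": ("Wind", {"Strong": "No", "Weak": "Yes"}),
})
print(classify(tree, {"Outlook": "Sunny", "Humidity": "Normal"}))  # Yes
```

Each call follows exactly one root-to-leaf path, testing one attribute per level.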
Each decision tree corresponds to a logical expression: a disjunction of conjunctions of constraints on attribute values, with one conjunction per path from the root to a leaf.
IV. The Basic Decision Tree Learning Algorithm
1. The ID3 Algorithm
ID3 learns by constructing the decision tree from the top down. Construction begins with the question "Which attribute should be tested at the root node of the tree?" To answer it, each instance attribute is evaluated with a statistical test to determine how well it alone classifies the training examples. The best attribute is selected and used as the test at the root node. A branch is then created for each possible value of this attribute, and the training examples are sorted under the appropriate branch. The whole process is repeated using the training examples associated with each branch node to select the best attribute to test at that point. This forms a greedy search for an acceptable decision tree, in which the algorithm never backtracks to reconsider its earlier choices.
The version summarized below is specialized to learning Boolean-valued functions. Outline of the ID3 algorithm:
ID3(Examples, Target_attribute, Attributes)
Examples is the set of training examples. Target_attribute is the attribute whose value is to be predicted by the tree. Attributes is the list of other attributes that may be tested by the learned decision tree. Returns a decision tree that correctly classifies the given Examples.
- Create a Root node for the tree
- If all Examples are positive, return the single-node tree Root with label = +
- If all Examples are negative, return the single-node tree Root with label = -
- If Attributes is empty, return the single-node tree Root with label = the most common value of Target_attribute in Examples
- Otherwise begin
  - A ← the attribute from Attributes that best classifies Examples
  - The decision attribute for Root ← A
  - For each possible value vi of A
    - Add a new branch below Root, corresponding to the test A = vi
    - Let Examples_vi be the subset of Examples that have value vi for A
    - If Examples_vi is empty
      - Then below this new branch add a leaf node with label = the most common value of Target_attribute in Examples
      - Else below this new branch add the subtree ID3(Examples_vi, Target_attribute, Attributes - {A})
- End
- Return Root
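The outline above can be turned into a compact recursive Python sketch. The dict-of-examples interface and the (attribute, branch-dict) tree representation are conveniences of this sketch, not part of the original pseudocode; branches are created only for attribute values that actually occur, so the empty-subset case never arises here:

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def info_gain(examples, target, attr):
    """Expected entropy reduction from splitting examples on attr."""
    before = entropy([ex[target] for ex in examples])
    after = 0.0
    for value in {ex[attr] for ex in examples}:
        subset = [ex[target] for ex in examples if ex[attr] == value]
        after += len(subset) / len(examples) * entropy(subset)
    return before - after

def id3(examples, target, attributes):
    """Recursive ID3 sketch. Leaves are label values; internal nodes
    are (attribute, {value: subtree}) pairs."""
    labels = [ex[target] for ex in examples]
    if len(set(labels)) == 1:          # all positive or all negative
        return labels[0]
    if not attributes:                 # no attribute left: majority label
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: info_gain(examples, target, a))
    branches = {}
    for value in {ex[best] for ex in examples}:
        subset = [ex for ex in examples if ex[best] == value]
        branches[value] = id3(subset, target,
                              [a for a in attributes if a != best])
    return (best, branches)
```

Like the pseudocode, this is a greedy top-down search: once an attribute is placed at a node, the choice is never revisited.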
2. Which Attribute Is the Best Classifier?
Entropy: characterizes the purity (or impurity) of an arbitrary collection of examples.
Entropy specifies the minimum number of bits needed to encode the classification of an arbitrary member of S (that is, a member drawn at random with uniform probability).
If the target attribute can take on c different values, the entropy of S relative to this c-wise classification is defined as:
Entropy(S) = Σ (i = 1..c) -p_i · log2(p_i)
where p_i is the proportion of S belonging to class i.
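The entropy formula can be checked numerically with a short helper; the counts-based interface (one class size per entry) is an assumption of this sketch:

```python
import math

def entropy(counts):
    """Entropy of a collection whose per-class sizes are given in counts.
    Empty classes contribute nothing (the 0*log 0 = 0 convention)."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)

# The collection S = [9+, 5-] used in the worked example below:
print(round(entropy([9, 5]), 3))  # 0.94
```

A perfectly mixed set such as [7+, 7-] has entropy exactly 1.0, and a pure set has entropy 0.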
Information gain: the information gain of an attribute is the expected reduction in entropy caused by partitioning the examples according to this attribute:
Gain(S, A) = Entropy(S) - Σ (v ∈ Values(A)) (|Sv| / |S|) · Entropy(Sv)
where Values(A) is the set of all possible values of attribute A, and Sv is the subset of S for which attribute A has value v.
For example, suppose S contains 14 examples, [9+, 5-]. Of these, 6 positive and 2 negative examples have Wind = Weak, and the remainder have Wind = Strong. The information gain obtained by partitioning the 14 examples on the attribute Wind is computed as follows:
Values(Wind) = {Weak, Strong}
S = [9+, 5-]
Sweak ← [6+, 2-]
Sstrong ← [3+, 3-]
Gain(S, Wind) = Entropy(S) - (8/14)·Entropy(Sweak) - (6/14)·Entropy(Sstrong)
= 0.940 - (8/14)·0.811 - (6/14)·1.00
= 0.048
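This arithmetic can be reproduced directly from the gain formula; the counts-based interface here is a convenience of the sketch:

```python
import math

def entropy(counts):
    """Entropy of a collection with the given per-class sizes."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)

def info_gain(s, partitions):
    """Gain(S, A): entropy of S minus the weighted entropy of the
    subsets produced by splitting S on attribute A."""
    total = sum(s)
    return entropy(s) - sum(sum(p) / total * entropy(p) for p in partitions)

# Wind example from the text: S=[9+,5-], Sweak=[6+,2-], Sstrong=[3+,3-]
print(round(info_gain([9, 5], [[6, 2], [3, 3]]), 3))  # 0.048
```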
3. An Example
First compute the information gain of each of the four attributes:
Gain(S, Outlook) = 0.246
Gain(S, Humidity) = 0.151
Gain(S, Wind) = 0.048
Gain(S, Temperature) = 0.029
According to the information-gain criterion, the attribute Outlook provides the best prediction of the target attribute PlayTennis over the training examples.
Ssunny = {D1, D2, D8, D9, D11}
Gain(Ssunny, Humidity) = 0.970 - (3/5)·0.0 - (2/5)·0.0 = 0.970
Gain(Ssunny, Temperature) = 0.970 - (2/5)·0.0 - (2/5)·1.0 - (1/5)·0.0 = 0.570
Gain(Ssunny, Wind) = 0.970 - (2/5)·1.0 - (3/5)·0.918 = 0.019
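These three gains can also be recomputed from the examples themselves. The article lists only the day names, so the attribute values below are an assumption taken from the standard PlayTennis table in Mitchell's Machine Learning; the helper functions are conveniences of this sketch. The printed gains match the values above up to rounding (0.971, 0.571, 0.020):

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def info_gain(examples, target, attr):
    """Expected entropy reduction from splitting examples on attr."""
    before = entropy([ex[target] for ex in examples])
    after = 0.0
    for value in {ex[attr] for ex in examples}:
        subset = [ex[target] for ex in examples if ex[attr] == value]
        after += len(subset) / len(examples) * entropy(subset)
    return before - after

# The five Outlook=Sunny days (assumed values, Mitchell's table):
ssunny = [
    {"Temp": "Hot",  "Humidity": "High",   "Wind": "Weak",   "Play": "No"},   # D1
    {"Temp": "Hot",  "Humidity": "High",   "Wind": "Strong", "Play": "No"},   # D2
    {"Temp": "Mild", "Humidity": "High",   "Wind": "Weak",   "Play": "No"},   # D8
    {"Temp": "Cool", "Humidity": "Normal", "Wind": "Weak",   "Play": "Yes"},  # D9
    {"Temp": "Mild", "Humidity": "Normal", "Wind": "Strong", "Play": "Yes"},  # D11
]
for attr in ("Humidity", "Temp", "Wind"):
    print(attr, round(info_gain(ssunny, "Play", attr), 3))
```

Humidity wins because it separates the two classes perfectly, so it becomes the test below the Sunny branch.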
V. Hypothesis Space Search in Decision Tree Learning
The hypothesis space searched by ID3 is the set of all decision trees, a complete space of finite discrete-valued functions of the available attributes.
As it moves through the space of decision trees, ID3 maintains only a single current hypothesis.
The basic ID3 algorithm performs no backtracking in its search.
At each step of the search, ID3 uses all of the current training examples, making statistically based decisions about how to refine its current hypothesis; this makes the search much less sensitive to errors in individual training examples.
For the C4.5 decision tree, see http://www.cnblogs.com/zhangchaoyang/articles/2842490.html
Reference: http://www.cnblogs.com/lufangtao/archive/2013/05/30/3103588.html