Learning notes on the ID3 algorithm


ID3 (examples, targetattribute, attributes)
/*
examples: the training sample set
targetattribute: the target attribute whose value is to be predicted
attributes: the set of attributes, other than the target attribute, available for learning the decision tree
*/

If the targetattribute values of all examples are the same value a, return a single-node tree whose node value is a.
Otherwise, further decisions must be made based on the other attributes:

If attributes is empty, no attribute is left to split on; in that case, return a single node whose value is the most common targetattribute value among the current examples (a reasonable default).
Otherwise,
Select the best attribute bestattr by the principle of maximum information gain; the value of the tree's root node is bestattr.
For each possible value v of bestattr, create a branch labeled v.
Let examples_v be the subset of examples whose bestattr value is v.
Recursively compute the subtree ID3(examples_v, targetattribute, attributes - {bestattr}).
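
The pseudocode above translates almost line for line into Python. The sketch below is a minimal illustration under some assumptions not stated in the original: each example is a dict mapping attribute names to values, attributes is passed as a set so that attributes - {bestattr} works, and the helper names entropy and information_gain are our own.

import math
from collections import Counter

def entropy(examples, targetattribute):
    # Entropy of the targetattribute value distribution in examples.
    counts = Counter(e[targetattribute] for e in examples)
    total = len(examples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(examples, attr, targetattribute):
    # Entropy of examples minus the weighted entropies of the subsets
    # obtained by splitting on attr.
    total = len(examples)
    remainder = 0.0
    for v in {e[attr] for e in examples}:
        examples_v = [e for e in examples if e[attr] == v]
        remainder += (len(examples_v) / total) * entropy(examples_v, targetattribute)
    return entropy(examples, targetattribute) - remainder

def id3(examples, targetattribute, attributes):
    values = [e[targetattribute] for e in examples]
    # All examples share the same target value: single-node tree.
    if len(set(values)) == 1:
        return values[0]
    # No attributes left: return the most common target value.
    if not attributes:
        return Counter(values).most_common(1)[0][0]
    # Pick the attribute with the maximum information gain.
    bestattr = max(attributes,
                   key=lambda a: information_gain(examples, a, targetattribute))
    tree = {bestattr: {}}
    for v in {e[bestattr] for e in examples}:
        examples_v = [e for e in examples if e[bestattr] == v]
        tree[bestattr][v] = id3(examples_v, targetattribute,
                                attributes - {bestattr})
    return tree

The return value is either a leaf (a target value) or a nested dict of the form {bestattr: {value: subtree}}; for example, with hypothetical weather-style data, id3(examples, "play", {"outlook", "wind"}) might return {"outlook": {"sunny": "no", "overcast": "yes", ...}}.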

 

Information gain and entropy:
The entropy of the training sample set examples is determined by the distribution of its targetattribute values:
the more uniform the distribution, the greater the entropy, that is, the more information (uncertainty) it carries; the more concentrated the distribution, the smaller the entropy, that is, the less information.
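
Concretely, the entropy here is the standard Shannon entropy (the 9/5 split below is an illustrative count, not from the original text):

Entropy(S) = - sum_i p_i * log2(p_i)

where p_i is the proportion of examples in S whose targetattribute takes the i-th value. For example, a sample set with 9 positive and 5 negative examples has

Entropy = -(9/14) * log2(9/14) - (5/14) * log2(5/14) ≈ 0.940

while a set in which all 14 examples are positive has entropy 0, and an even 7/7 split has the maximum entropy of 1.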

How can we choose the best attribute for a decision?
We select an attribute, divide examples into subsample sets according to each of its values, and then compute the sum of the entropies of the subsample sets:
Information gain = entropy of the original examples - sum of the entropies of the subsample sets
Intuitively, the best attribute is the one that classifies the current training samples most simply and decisively. The root node of the decision tree, where no decision has yet been made, has the greatest uncertainty and therefore the largest entropy; a leaf node of the decision tree holds a definite value, so its entropy is 0, the smallest.

The decision process of the tree is a process of reducing uncertainty, i.e., entropy. Therefore, when selecting an attribute, you should select the attribute that reduces entropy the most, that is, the attribute with the maximum information gain.

Intuition might suggest choosing the attribute that divides the samples most evenly. Of course, this is not accurate: "evenness" cannot refer only to the sizes of the subsample sets; it must also take the targetattribute values within them into account. The choice is determined by computing the information gain.
Therefore, the attribute with the largest information gain is the best one. (Note: the entropy of each subsample set is computed in the same way as that of examples, and the "sum of the entropies" of the subsample sets is not a simple sum: each term is multiplied by the coefficient |subsample set| / |examples|.)
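
As a worked example of this weighted sum (the counts are illustrative, not from the original text): split the 14 examples above (9 positive, 5 negative, entropy ≈ 0.940) on a two-valued attribute that produces one subsample set of 8 examples (6 positive, 2 negative, entropy ≈ 0.811) and one of 6 examples (3 positive, 3 negative, entropy = 1.0). Then

Information gain = 0.940 - (8/14) * 0.811 - (6/14) * 1.0 ≈ 0.048

The two entropies are weighted by 8/14 and 6/14 rather than simply added.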
