Machine learning--analysis and implementation of decision tree (ID3 algorithm)



For the KNN algorithm, please refer to: http://blog.csdn.net/gamer_gyt/article/details/47418223


1. Introduction

A decision tree is a predictive model: it represents a mapping between object attributes and object values. Each internal node tests an attribute, each branch corresponds to one possible value of that attribute, and each leaf node holds the value (class) of the objects described by the path from the root to that leaf. A decision tree has only a single output; if more complex output is needed, a separate decision tree can be built for each output. In data mining, the decision tree is a frequently used technique that can be applied both to analyze data and to make predictions.
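To illustrate this mapping, a small tree can be stored as a nested dictionary and walked from the root to a leaf. This is a hypothetical sketch: the attribute names and the tree below are invented for the example and are not taken from the article's code.

```python
def classify(tree, sample):
    """Walk the tree: at each inner node, follow the branch matching the
    sample's value for the tested attribute, until a leaf (class label)."""
    while isinstance(tree, dict):
        attribute = next(iter(tree))               # attribute tested at this node
        tree = tree[attribute][sample[attribute]]  # follow the matching branch
    return tree                                    # leaf: the class label

# Hypothetical tree: first test "has fins", then "can swim".
toy_tree = {"has fins": {"no": "not a fish",
                         "yes": {"can swim": {"no": "not a fish",
                                              "yes": "fish"}}}}

print(classify(toy_tree, {"has fins": "yes", "can swim": "yes"}))
```

The path "has fins = yes" then "can swim = yes" ends at the leaf "fish", which is exactly the path-to-leaf mapping described above.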
2. Basic Idea
1) The tree begins as a single node representing the whole set of training samples.
2) If the samples all belong to the same class, the node becomes a leaf labeled with that class.
3) Otherwise, the algorithm chooses the attribute with the greatest classification capability (in ID3, the highest information gain) as the test at the current node.
4) According to the values of the chosen attribute, the training set is divided into subsets, with one branch per distinct value: an attribute with several values produces several branches. The same procedure is then applied recursively to each subset to form the subtree on each branch. Once an attribute has been used at a node, it need not be considered again by any descendant of that node.
5) The recursive partitioning stops only when one of the following conditions holds:
① All samples at the node belong to the same class.
② No remaining attributes can be used to further divide the samples. In this case a majority vote is taken: the node becomes a leaf labeled with the most common class among its samples (the class distribution of the node's samples may also be stored).
③ A branch contains no samples. In this case a leaf is created and labeled with the majority class of the parent node's samples.

3. Construction Method
The input to decision tree construction is a set of examples with class labels; the result is a binary tree or a multi-way tree. An internal (non-leaf) node of a binary tree generally expresses a logical test of the form A = aj, where A is an attribute and aj is one of its values; the edges of the tree are the outcomes of that test. An internal node of a multi-way tree (as in ID3) is an attribute, and its edges are all the values of that attribute, so an attribute with several values produces several edges. The leaf nodes of the tree are class labels. Because of improper data representation, noise, or repeated subtrees produced during generation, the resulting decision tree can grow too large, so simplifying (pruning) the tree is an indispensable step. In searching for an optimal decision tree, the following three optimization goals should be addressed: ① generate the fewest leaf nodes; ② minimize the depth of each leaf node; ③ generate a tree that combines the fewest leaf nodes with the smallest leaf depth.

4. Python Code Implementation
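The procedure described above can be sketched in a short, self-contained Python program. This is a minimal illustration, not the article's own trees module: the function names and the tiny data set are hypothetical. Entropy and information gain select the splitting attribute, and the recursion implements the stopping conditions (pure node; attributes exhausted, resolved by majority vote).

```python
from collections import Counter
from math import log2

def entropy(rows):
    """Shannon entropy of the class labels (last column of each row)."""
    counts = Counter(row[-1] for row in rows)
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def best_attribute(rows):
    """Index of the attribute with the highest information gain
    (the attribute with the 'most classification capability')."""
    base = entropy(rows)
    best_gain, best_index = -1.0, -1
    for i in range(len(rows[0]) - 1):          # last column is the label
        subsets = {}
        for row in rows:                       # split rows by value of attribute i
            subsets.setdefault(row[i], []).append(row)
        remainder = sum(len(s) / len(rows) * entropy(s) for s in subsets.values())
        if base - remainder > best_gain:
            best_gain, best_index = base - remainder, i
    return best_index

def majority_class(rows):
    """Most common class label among the rows (the majority vote)."""
    return Counter(row[-1] for row in rows).most_common(1)[0][0]

def build_tree(rows, attribute_names):
    """Recursive ID3: returns a class label (leaf) or a nested dict
    {attribute: {value: subtree, ...}}."""
    labels = [row[-1] for row in rows]
    if labels.count(labels[0]) == len(labels):   # stop ①: all one class
        return labels[0]
    if not attribute_names:                      # stop ②: attributes exhausted
        return majority_class(rows)              # -> majority vote
    i = best_attribute(rows)
    name = attribute_names[i]
    tree = {name: {}}
    remaining = attribute_names[:i] + attribute_names[i + 1:]
    for value in {row[i] for row in rows}:       # one branch per distinct value
        subset = [row[:i] + row[i + 1:] for row in rows if row[i] == value]
        tree[name][value] = build_tree(subset, remaining)
    return tree

# Hypothetical data set: columns are [no surfacing, flippers, class].
data = [[1, 1, "yes"], [1, 1, "yes"], [1, 0, "no"], [0, 1, "no"], [0, 1, "no"]]
print(build_tree(data, ["no surfacing", "flippers"]))
```

On this data set the root splits on "no surfacing" (the higher-gain attribute), and the branch with value 1 splits again on "flippers", reproducing the nested-dict structure described in the introduction.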
Invocation: from the command line, change into the directory containing the code, start Python, and run: import trees, then dataset, labels = trees.createDataSet(), then trees.myTree().


Copyright notice: this is an original article by the blogger; please do not reproduce it without the blogger's permission.

