AI Machine Learning: Decision Tree Algorithms, Concepts and Learning Process

Source: Internet
Author: User

1. Concept

A decision tree classifies data through a series of rules: each rule states which value is assigned under which conditions. Decision trees come in two kinds, classification trees and regression trees. A classification tree predicts a discrete variable, and a regression tree predicts a continuous variable.

A classification decision tree model is a tree structure that describes how instances are classified. The tree consists of nodes and directed edges. There are two types of nodes: internal nodes, each of which represents a feature or attribute, and leaf nodes, each of which represents a class.

Intuitively, a decision tree classifier is like a flowchart made up of decision blocks and terminating blocks. A terminating block represents the result of the classification (that is, a leaf of the tree). A decision block represents a test on the value of one feature, and it has as many branches as the feature has values.

Here is an example quoted from the Web:

Mother: Let me introduce you to someone.

Daughter: How old is he?

Mother: 26.

Daughter: Is he handsome?

Mother: Very handsome.

Daughter: Is his income high?

Mother: Not very high, about average.

Daughter: Is he a civil servant?

Mother: Yes, he works in the Inland Revenue Department.

Daughter: All right, I'll meet him.

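This conversation can be represented as a decision tree: each question the daughter asks is an internal node, each possible answer is a branch, and the final decision to meet or not is a leaf.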

Programmers constantly write if, else if, else, and in doing so they are already thinking in decision trees. But have you ever considered, when there are many conditions, which feature should be tested first and which should be tested later? How to quantify that choice precisely is the key to decision tree learning algorithms.
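As a minimal sketch, the conversation above can be written as nested conditions. Only the path actually taken in the dialogue (26, handsome, medium income, civil servant) comes from the example. The age cutoff `max_age=30`, the value names such as `"ugly"` and `"high"`, and the outcome of every other branch are assumptions made for illustration.

```python
def will_meet(age, looks, income, civil_servant, max_age=30):
    """Return True if the daughter agrees to meet the candidate.

    max_age=30 is a hypothetical cutoff; the dialogue only tells us
    that an age of 26 is acceptable.
    """
    if age > max_age:            # root node: age
        return False
    if looks == "ugly":          # second node: looks (assumed branch)
        return False
    if income == "high":         # third node: income (assumed branch)
        return True
    if income == "medium":       # the path taken in the dialogue
        return civil_servant     # fourth node: civil servant or not
    return False                 # low income (assumed branch)


# The candidate from the dialogue.
print(will_meet(26, "handsome", "medium", True))
```

The order of the `if` statements here was chosen by hand. A decision tree learning algorithm has to derive that order from data.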


2. The learning process of a decision tree


The generation process of a decision tree is divided into the following 3 parts:

Feature selection:

Feature selection means choosing one feature from the many features in the training data as the splitting criterion for the current node. There are many different quantitative criteria for making that choice, and they give rise to different decision tree algorithms.

1. Why do feature selection

With a limited number of samples, designing a classifier that uses a large number of features is computationally expensive and gives poor classification performance.

2. The exact meaning of feature selection

Samples in a high-dimensional space can be mapped or transformed into a low-dimensional space. Feature selection then reduces the dimensionality further by removing redundant and irrelevant features.

3. The principles of feature selection

Obtain the smallest possible subset of features without significantly reducing classification accuracy and without affecting the class distribution. The feature subset should also be stable and adaptable.
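The text does not name a particular evaluation criterion for choosing the splitting feature. As one illustration, the sketch below uses information gain, the criterion commonly associated with the ID3 algorithm: the feature whose split reduces the entropy of the class labels the most is chosen. The toy data and the names `entropy` and `information_gain` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the samples on `feature`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted average entropy of the groups produced by the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: the label depends only on "looks".
rows = [
    {"looks": "handsome", "civil_servant": "yes"},
    {"looks": "handsome", "civil_servant": "no"},
    {"looks": "ugly", "civil_servant": "yes"},
    {"looks": "ugly", "civil_servant": "no"},
]
labels = ["meet", "meet", "no", "no"]

gains = {f: information_gain(rows, labels, f) for f in rows[0]}
best = max(gains, key=gains.get)
print(gains, best)
```

On this data, splitting on `looks` separates the classes completely, so it has the higher gain and is selected. Other criteria, such as the gain ratio or the Gini index, would plug into the same place.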

Decision tree generation:

Using the chosen feature evaluation criterion, child nodes are generated recursively from the top down until the dataset can no longer be split, at which point the tree stops growing. Recursion is the most natural way to understand a tree structure.
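The recursive, top-down generation can be sketched as follows. To keep the example short, the feature choice is simplified to taking the next feature in a given list. A real algorithm would instead pick the feature that scores best under its evaluation criterion. The nested-dict tree representation, the function names, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def majority(labels):
    """Most common label; used when no feature is left to split on."""
    return Counter(labels).most_common(1)[0][0]


def build_tree(rows, labels, features):
    """Recursively build a tree as nested dicts; leaves are class labels."""
    if len(set(labels)) == 1:      # all samples agree: stop with a leaf
        return labels[0]
    if not features:               # nothing left to split on: stop with a leaf
        return majority(labels)
    # Placeholder selection rule: take the features in the order given.
    feature, rest = features[0], features[1:]
    node = {"feature": feature, "children": {}}
    for value in sorted({row[feature] for row in rows}):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[feature] == value]
        sub_rows = [r for r, _ in pairs]
        sub_labels = [l for _, l in pairs]
        node["children"][value] = build_tree(sub_rows, sub_labels, rest)
    return node


def predict(tree, row):
    """Follow branches from the root until a leaf label is reached."""
    while isinstance(tree, dict):
        tree = tree["children"][row[tree["feature"]]]
    return tree


# Hypothetical toy data.
rows = [
    {"income": "high", "civil_servant": "no"},
    {"income": "medium", "civil_servant": "yes"},
    {"income": "medium", "civil_servant": "no"},
    {"income": "low", "civil_servant": "yes"},
]
labels = ["meet", "meet", "no", "no"]

tree = build_tree(rows, labels, ["income", "civil_servant"])
print(tree)
print(predict(tree, {"income": "medium", "civil_servant": "yes"}))
```

Each recursive call works on a smaller subset of the data and a shorter feature list, so the recursion always terminates.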

Pruning:

Because of the way the decision tree algorithm works, it easily learns the training data in too much detail, which makes classification inaccurate. For example, if one very specific feature is used as the criterion for a category, data that lacks that particular attribute will not be placed in the category. This is called overfitting: the tree matches the training data too finely. To solve the problem, the decision tree has to be simplified by removing features that are too fine-grained. In the tree structure this means removing some branches, which is called pruning. There are two kinds of pruning techniques: pre-pruning and post-pruning.
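As a minimal sketch of pre-pruning, the builder below stops splitting once a depth limit is reached and returns the majority class instead. A depth limit is only one possible stopping rule, and post-pruning (growing the full tree first and then cutting branches) is not shown. The `max_depth` parameter, the helper names, and the toy data with an overly specific `shoe_size` feature are hypothetical.

```python
from collections import Counter


def majority(labels):
    """Most common label among the samples at a node."""
    return Counter(labels).most_common(1)[0][0]


def build_tree(rows, labels, features, max_depth, depth=0):
    """Recursive tree generation that stops early at max_depth (pre-pruning)."""
    if len(set(labels)) == 1:
        return labels[0]
    if not features or depth >= max_depth:   # pre-pruning: refuse to split further
        return majority(labels)
    feature, rest = features[0], features[1:]
    children = {}
    for value in sorted({row[feature] for row in rows}):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[feature] == value]
        children[value] = build_tree(
            [r for r, _ in pairs], [l for _, l in pairs], rest, max_depth, depth + 1
        )
    return {"feature": feature, "children": children}


def count_leaves(tree):
    """Number of leaves, used here as a rough measure of tree complexity."""
    if not isinstance(tree, dict):
        return 1
    return sum(count_leaves(child) for child in tree["children"].values())


# Hypothetical data: the third sample is noise that only shoe_size can explain.
rows = [
    {"income": "high", "shoe_size": "41"},
    {"income": "high", "shoe_size": "42"},
    {"income": "high", "shoe_size": "43"},
    {"income": "low", "shoe_size": "41"},
    {"income": "low", "shoe_size": "42"},
]
labels = ["meet", "meet", "no", "no", "no"]

full = build_tree(rows, labels, ["income", "shoe_size"], max_depth=2)
pruned = build_tree(rows, labels, ["income", "shoe_size"], max_depth=1)
print(count_leaves(full), count_leaves(pruned))
print(pruned)
```

The unpruned tree memorizes the noisy sample by splitting on `shoe_size`. The pruned tree keeps only the `income` split and labels the high-income branch with its majority class.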

