Machine Learning-Tree regression

Source: Internet
Author: User
Tags id3

Linear regression builds a single global model that must fit all sample points (locally weighted linear regression is the exception). When the data has many features and the relationships between them are complex, building one global model becomes difficult and awkward. Moreover, many real-world problems are non-linear, and a global linear model cannot fit such data.

One approach is to split the dataset into many subsets that are each easy to model, and then fit each subset with linear regression. If a subset is still hard to fit with a linear model after the first split, the splitting continues.
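The idea of splitting and then fitting can be illustrated with a minimal sketch. It assumes a single feature, a hand-picked split point, and toy data invented for illustration; the helper names `fit_line`, `squared_error`, and `fit_piecewise` are hypothetical and not taken from any library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # All x values identical: fall back to a constant model.
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def squared_error(xs, ys, a, b):
    """Sum of squared residuals of the line y = a*x + b."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


def fit_piecewise(xs, ys, threshold):
    """Split the data at `threshold` and fit one line per side.

    Assumes both sides are non-empty.
    """
    left = [(x, y) for x, y in zip(xs, ys) if x <= threshold]
    right = [(x, y) for x, y in zip(xs, ys) if x > threshold]
    models = []
    for part in (left, right):
        models.append(fit_line([p[0] for p in part], [p[1] for p in part]))
    return models


# Toy data (made up): y = x for x <= 3 and y = 2x for x > 3.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 1, 2, 3, 8, 10, 12, 14]

global_a, global_b = fit_line(xs, ys)
global_error = squared_error(xs, ys, global_a, global_b)

(left_a, left_b), (right_a, right_b) = fit_piecewise(xs, ys, threshold=3)
```

On this toy data the single global line leaves a visible residual error, while the two local lines recover each segment. A real tree would search for the split point instead of being given one, and would keep splitting any subset that is still poorly fit.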

Decision tree construction is a greedy algorithm: it makes the best choice available at each step and does not concern itself with reaching the global optimum.

CART (Classification and Regression Trees)

The classification tree construction algorithm used previously is ID3, which selects the splitting attribute by information gain. ID3 picks the current best feature each time and then splits the data on every possible value of that feature. That is, if a feature has 4 values, the data is cut into 4 parts. Once a feature has been used for a split, it plays no further role in the rest of the algorithm, so some argue that this way of splitting is too aggressive. An alternative is binary splitting, which cuts the dataset into two parts at a time: if a sample's feature value equals the split value, the sample goes into the left subtree; otherwise it goes into the right subtree.
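The difference between the two splitting styles can be sketched in a few lines. This is an illustrative sketch rather than a full ID3 or CART implementation: the function names, the row layout (features first, class label last), and the toy rows are all assumptions made for the example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def multiway_split(rows, feature):
    """ID3-style split: one subset per distinct value of the feature."""
    subsets = {}
    for row in rows:
        subsets.setdefault(row[feature], []).append(row)
    return subsets


def information_gain(rows, feature, label=-1):
    """Entropy before the split minus the weighted entropy after it."""
    base = entropy([r[label] for r in rows])
    remainder = 0.0
    for subset in multiway_split(rows, feature).values():
        weight = len(subset) / len(rows)
        remainder += weight * entropy([r[label] for r in subset])
    return base - remainder


def binary_split(rows, feature, value):
    """Binary split: rows whose feature equals `value` go left, the rest right."""
    left = [r for r in rows if r[feature] == value]
    right = [r for r in rows if r[feature] != value]
    return left, right


# Toy rows (made up): (outlook, temperature, label)
rows = [
    ("sunny", "hot", "no"),
    ("sunny", "mild", "no"),
    ("overcast", "hot", "yes"),
    ("rainy", "mild", "yes"),
    ("rainy", "cool", "yes"),
    ("overcast", "cool", "yes"),
]

# ID3 would pick the feature with the largest information gain.
best_feature = max(range(2), key=lambda f: information_gain(rows, f))
parts = multiway_split(rows, best_feature)
left, right = binary_split(rows, best_feature, "sunny")
```

Here the multiway split on the chosen feature produces one subset per distinct value and uses the feature up in a single step, whereas the binary split produces only two subsets and leaves the remaining values of the same feature available for later splits.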

ID3 has another problem: it cannot handle continuous data directly. It can be used only if continuous features are converted to discrete ones beforehand.

The CART algorithm uses binary splitting to handle continuous variables, and with a slight modification it can also handle regression problems. For classification, the CART decision tree selects the splitting attribute using the Gini index, which measures the purity of a dataset.
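As a rough illustration of a Gini-based binary split on a continuous feature, the sketch below scores candidate thresholds by the weighted Gini index of the two resulting subsets. Several details are assumptions made for the example rather than something the text specifies: candidate thresholds are midpoints between adjacent distinct values, samples with a value at or below the threshold go left, and the names `gini` and `best_threshold` are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini index: 0 for a pure set, larger for more mixed sets."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted Gini) for the best binary split.

    Candidate thresholds are midpoints between adjacent distinct values
    (an assumed convention for this sketch).
    """
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best = (None, float("inf"))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [lab for v, lab in pairs if v <= t]
        right = [lab for v, lab in pairs if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (t, score)
    return best


# Toy data (made up): one continuous feature and two classes.
values = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
labels = ["a", "a", "a", "b", "b", "b"]

impurity_before = gini(labels)
threshold, impurity_after = best_threshold(values, labels)
```

On this data the split between 3.0 and 7.0 separates the classes completely, so the weighted Gini index drops to zero. For regression, a common form of the modification is to replace the Gini index with the squared error (variance) of the target values in each subset, keeping the same threshold search.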
