Data analysis algorithms

Source: Internet
Author: User
Tags: id3

Decision Tree

Decision trees are used to classify the records in a dataset.

Each record is assumed to contain several attributes, and the decision tree classifies records by testing those attributes.

ID3 algorithm

How do we decide which attribute to partition on? The criterion is that partitioning on the attribute should minimize the weighted information entropy of the resulting data sets (the smaller the information entropy, the more uniform the data). This is the same as choosing the attribute for which the entropy difference before and after the split is the largest.
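The selection criterion can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `entropy` and `information_gain`, the dictionary record layout, and the toy weather records are all assumptions made for the example.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(records, attribute, target):
    """Entropy of the whole set minus the weighted entropy after the split."""
    base = entropy([r[target] for r in records])
    # Group the class labels by the value each record has for `attribute`.
    groups = {}
    for r in records:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(records) * entropy(g) for g in groups.values())
    return base - remainder

# Hypothetical toy data: "play" is the class to predict.
records = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]
gain_outlook = information_gain(records, "outlook", "play")
gain_windy = information_gain(records, "windy", "play")
```

In this toy data, splitting on `outlook` leaves every branch with a single class, so it removes all of the entropy, while splitting on `windy` removes none. ID3 would therefore choose `outlook`.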

Suppose that attribute A has n values. Partitioning on A produces n branches, and the sub-dataset in each branch no longer contains attribute A.

Recursively partition the data set in each branch. The recursion ends when the data cannot be divided further (no attributes remain) or all records in the branch have the same value (class).

This generates a decision tree. New records are classified using the leaf nodes of the tree.
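The recursive procedure described above might look like the following sketch. The names `build_tree` and `classify`, the nested-dictionary tree layout, the majority-class fallback for unseen values, and the toy records are assumptions for illustration and are not part of the original description.

```python
from collections import Counter
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def split(records, attribute):
    """Group records by their value of `attribute`."""
    groups = {}
    for r in records:
        groups.setdefault(r[attribute], []).append(r)
    return groups

def gain(records, attribute, target):
    base = entropy([r[target] for r in records])
    remainder = sum(
        len(g) / len(records) * entropy([r[target] for r in g])
        for g in split(records, attribute).values()
    )
    return base - remainder

def build_tree(records, attributes, target):
    labels = [r[target] for r in records]
    majority = Counter(labels).most_common(1)[0][0]
    # Stop when the branch is pure or no attributes remain.
    if len(set(labels)) == 1 or not attributes:
        return majority
    best = max(attributes, key=lambda a: gain(records, a, target))
    # The chosen attribute is removed from the sub-datasets' candidates.
    remaining = [a for a in attributes if a != best]
    return {
        "attribute": best,
        "default": majority,  # assumption: fall back to the majority class
        "branches": {
            value: build_tree(group, remaining, target)
            for value, group in split(records, best).items()
        },
    }

def classify(tree, record):
    """Walk from the root to a leaf; a leaf is a plain class label."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(record[tree["attribute"]], tree["default"])
    return tree

# Hypothetical toy data: "play" is the class to predict.
records = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]
tree = build_tree(records, ["outlook", "windy"], "play")
prediction = classify(tree, {"outlook": "rainy", "windy": "yes"})
```

Here the root tests `outlook` and both branches are already pure, so the tree has a single level and the new record is classified as `yes`.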

C4.5 algorithm

C4.5 is an extension of ID3. The differences:
1. When selecting the partitioning attribute, it compares the gain ratio (the entropy difference divided by the split information) rather than the raw entropy difference.
2. C4.5 uses a pruning algorithm to reduce the overfitting caused by noisy data.
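The first difference can be illustrated with a short sketch. The function names, the zero-division guard, and the toy records (including the artificial `id` attribute) are assumptions for the example. Full C4.5 applies further heuristics, and pruning is not shown here.

```python
from collections import Counter
from math import log2

def entropy(values):
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in Counter(values).values())

def information_gain(records, attribute, target):
    base = entropy([r[target] for r in records])
    groups = {}
    for r in records:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(records) * entropy(g) for g in groups.values())
    return base - remainder

def split_info(records, attribute):
    """Entropy of the attribute's own value distribution."""
    return entropy([r[attribute] for r in records])

def gain_ratio(records, attribute, target):
    info = split_info(records, attribute)
    # Assumption: an attribute with a single value gets a ratio of 0.
    return information_gain(records, attribute, target) / info if info else 0.0

# Hypothetical toy data: "id" is unique per record, so it splits the data
# perfectly but says nothing useful about new records.
records = [
    {"id": 1, "outlook": "sunny", "play": "no"},
    {"id": 2, "outlook": "sunny", "play": "no"},
    {"id": 3, "outlook": "rainy", "play": "yes"},
    {"id": 4, "outlook": "rainy", "play": "yes"},
]
ratio_id = gain_ratio(records, "id", "play")
ratio_outlook = gain_ratio(records, "outlook", "play")
```

Both attributes have the same raw entropy difference in this data, but `id` creates four branches and so has a larger split information. Dividing by it makes `outlook` the preferred attribute, which shows how the gain ratio penalizes attributes with many distinct values.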

http://blog.csdn.net/xuxurui007/article/details/18045943
http://blog.csdn.net/zjd950131/article/details/8027081

KNN

Determine the category of a new data point based on the attributes of the records in the dataset and the known category of each record.

Calculate the distance between the new data point and each existing data point, then take the most common category among the k nearest points as the predicted category for the new data point.
The distance is calculated as: distance = sqrt(delta(attribute_x)**2 + delta(attribute_y)**2 + delta(attribute_z)**2 + ...)
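A minimal sketch of this procedure follows. The function names `euclidean` and `knn_predict`, the default `k=3`, and the sample points are assumptions for illustration. Ties between categories are resolved by whichever category `Counter` lists first.

```python
from collections import Counter
from math import sqrt

def euclidean(a, b):
    """Square root of the summed squared differences of each attribute."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(training, point, k=3):
    """Return the most common category among the k nearest training points."""
    nearest = sorted(training, key=lambda item: euclidean(item[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training data: (attributes, category) pairs.
training = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]
prediction = knn_predict(training, (1.1, 1.0), k=3)
```

The three nearest neighbours of the point `(1.1, 1.0)` all belong to category `a`, so that is the prediction. In practice the attributes are often scaled first so that no single attribute dominates the distance.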

Naive Bayes

To be continued.
