Data Mining (introductory knowledge)

Source: Internet
Author: User



I have recently been reading a book called "Big Talk Data Mining". Below is a short summary of some of the basic theory of data mining:

1. Data mining (also known in academia as KDD, knowledge discovery in databases) is the process of extracting previously unknown but potentially useful information and knowledge from large amounts of incomplete, noisy, fuzzy, and random data. (Most algorithms are based on the law of large numbers from statistics.)

2. What data mining can do: data mining tasks fall into two kinds, descriptive tasks and predictive tasks:

Descriptive tasks include clustering, association analysis, sequence analysis, anomaly detection, etc.

Predictive tasks include regression and classification.

(1) Association rule mining (the Apriori algorithm, 1994): also includes sequence and time-series mining
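
As a rough illustration of the Apriori idea mentioned above, here is a minimal sketch of its frequent-itemset step. The function name `apriori`, the `min_support` parameter (an absolute transaction count), and the shopping baskets are all assumptions made for this example and do not come from the book. Generating rules from the frequent itemsets is left out.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset (frozenset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: n for s, n in counts.items() if n >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        prev = list(frequent)
        candidates = {a | b for a, b in combinations(prev, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
        # Count step: scan the transactions once for the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: n for s, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical shopping baskets, invented for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent_itemsets = apriori(baskets, min_support=3)
```

With a support threshold of 3, the pair {diapers, beer} survives while {bread, milk, diapers} does not, because it appears in only two baskets.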

(2) Cluster analysis:

Continuous data: K-means, K-medoids

Discrete data: K-modes, K-prototypes (K-prototypes also handles mixed numeric and categorical data)

Non-spherical clusters (density-based clustering algorithms): DBSCAN, OPTICS, DENCLUE

Hierarchical clustering algorithms: agglomerative and divisive

Visual clustering algorithms
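
To make the K-means entry above more concrete, here is a minimal sketch of the usual assign-then-update loop (Lloyd's algorithm) in plain Python. The function name `kmeans`, the choice to seed the centroids with the first k points, and the sample points are assumptions made for this example. Real implementations normally use a better initialization.

```python
import math


def kmeans(points, k, iterations=20):
    """Cluster equal-length numeric tuples; return (centroids, labels)."""
    # Assumption: seed with the first k points so the run is deterministic.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Hypothetical 2-D points forming two well-separated groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 8.5)]
centroids, labels = kmeans(points, k=2)
```

Because each centroid is a mean, this only makes sense for continuous data, which is why the discrete case needs variants such as K-modes.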

(3) Prediction: the basic principle is a black-box model. It ignores the specific mechanism that links input to output and captures only the correlation between them, without regard to any causal relationship.

(4) Regression: linear regression, linear fitting
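
A minimal sketch of a linear fit, assuming a single input variable and ordinary least squares. The function name `fit_line` and the sample data are invented for this example.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered cross-products and squares (covariance and variance up to 1/n).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical measurements lying roughly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
slope, intercept = fit_line(xs, ys)
predicted = slope * 6.0 + intercept  # predict y for an unseen x
```

The fitted line passes through the point of means (mean_x, mean_y), which is how the intercept is obtained from the slope.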

(5) Deviation detection: describes the few extreme exceptions among the objects being analyzed and reveals their underlying causes
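
One simple way to flag such extreme exceptions is a z-score test, sketched below. This is an assumed technique chosen for illustration, not one named in the book. The function name `find_outliers`, the threshold of 2 standard deviations, and the readings are also assumptions. The code only flags the unusual values. Explaining their cause remains a job for the analyst.

```python
import statistics


def find_outliers(values, threshold=2.0):
    """Return (index, value) pairs lying more than `threshold` sample
    standard deviations away from the mean."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation
    return [
        (i, v) for i, v in enumerate(values)
        if abs(v - mean) / std > threshold
    ]


# Hypothetical sensor readings with one obvious exception.
readings = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 9.7, 25.0, 10.1, 9.9]
outliers = find_outliers(readings)
```

Note that a large outlier inflates the standard deviation itself, so on small samples a strict threshold such as 3 can miss it. Median-based variants are more robust.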

More to be added later ...




