Data Mining Overview

Source: Internet
Author: User
Data mining is the non-trivial process of extracting valid, novel, potentially useful, and ultimately understandable patterns from massive, incomplete, noisy, fuzzy, and random data sets. It is a broad interdisciplinary field that draws on machine learning, mathematical statistics, neural networks, databases, pattern recognition, rough sets, fuzzy mathematics, and other related technologies.

 

Because data mining is an interdisciplinary subject that attracts researchers from different fields, it goes by many different names. The most common terms are "knowledge discovery" and "data mining". "Data mining" is used mostly in statistics (where the term first appeared), data analysis, databases, and management information systems, while "knowledge discovery" is used mainly in AI and machine learning.

Data mining can be roughly understood as a three-stage process: data preparation, data mining, and interpretation and evaluation of results.
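The three stages can be illustrated with a minimal Python sketch. Everything here is hypothetical and chosen only for illustration: the sample records, the function names (prepare, mine, evaluate, run_pipeline), and the min_share threshold are assumptions, and a simple frequency count stands in for a real mining algorithm.

```python
from collections import Counter

# Hypothetical raw records: incomplete (None, empty) and noisy (stray case/whitespace).
RAW_RECORDS = ["bread", " Milk", None, "bread", "MILK ", "eggs", "", "bread"]


def prepare(records):
    """Stage 1, data preparation: drop missing values and normalize text."""
    cleaned = []
    for record in records:
        if record is None:
            continue
        value = record.strip().lower()
        if value:
            cleaned.append(value)
    return cleaned


def mine(items):
    """Stage 2, data mining: here just a frequency count per item."""
    return Counter(items)


def evaluate(counts, min_share=0.3):
    """Stage 3, interpretation and evaluation: keep items whose share
    of the cleaned data is at least min_share (an assumed threshold)."""
    total = sum(counts.values())
    return {
        item: count / total
        for item, count in counts.items()
        if count / total >= min_share
    }


def run_pipeline(records, min_share=0.3):
    """Chain the three stages in order."""
    return evaluate(mine(prepare(records)), min_share)


if __name__ == "__main__":
    print(run_pipeline(RAW_RECORDS))
```

In practice each stage is far more involved, but keeping them as separate functions mirrors the separation described above: the mining step never sees uncleaned data, and the evaluation step decides which mined results are worth reporting.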

There are several types of data mining tasks: classification or prediction model discovery, data summarization, data clustering, association rule discovery, sequential pattern discovery, dependency relationship or dependency model discovery, anomaly and trend discovery, and so on.
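As one concrete instance of these tasks, association rule discovery is usually described in terms of two measures: the support of an itemset (the fraction of transactions containing it) and the confidence of a rule A -> B (the fraction of transactions containing A that also contain B). The sketch below computes both on a small made-up set of transactions. The data and the helper names (count, support, confidence) are assumptions for illustration, not taken from any particular library.

```python
# Hypothetical market-basket transactions, one set of items per purchase.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for transaction in transactions if itemset <= transaction)


def support(itemset, transactions):
    """Fraction of all transactions that contain itemset."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Among transactions containing antecedent, the fraction that
    also contain consequent. Returns 0.0 if antecedent never occurs."""
    base = count(antecedent, transactions)
    if base == 0:
        return 0.0
    joint = count(set(antecedent) | set(consequent), transactions)
    return joint / base


if __name__ == "__main__":
    print("support({bread, milk}) =", support({"bread", "milk"}, TRANSACTIONS))
    print("confidence(bread -> milk) =", confidence({"bread"}, {"milk"}, TRANSACTIONS))
```

A full rule-discovery algorithm would additionally search the space of candidate itemsets and keep only the rules above chosen support and confidence thresholds; this sketch only evaluates a rule that is already given.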

Based on the objects being mined, there are several kinds of data sources: relational databases, object-oriented databases, spatial databases, temporal databases, text data sources, multimedia data, heterogeneous databases, legacy databases, and Web data sources.

Based on the mining method, data mining can be roughly divided into statistical methods, machine learning methods, neural network methods, and database methods. Statistical methods can be subdivided into regression analysis (multiple regression, autoregression, etc.), discriminant analysis (Bayesian, Fisher, non-parametric, and so on), cluster analysis (hierarchical clustering, dynamic clustering, etc.), and exploratory analysis (principal component analysis, correlation analysis, etc.), as well as fuzzy sets, rough sets, and support vector machines. Machine learning methods can be subdivided into inductive learning (decision trees, rule induction, etc.), case-based reasoning (CBR), genetic algorithms, and Bayesian belief networks. Neural network methods can be subdivided into feedforward neural networks (such as those trained with the BP, or backpropagation, algorithm) and self-organizing neural networks (such as self-organizing feature maps and competitive learning). Database methods are mainly based on visual multi-dimensional data analysis or OLAP, along with attribute-oriented induction.
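To make one of the statistical methods concrete, the following is a minimal sketch of regression analysis in its simplest form: fitting a straight line y = slope * x + intercept by ordinary least squares. It uses a single predictor rather than the multiple regression named above, purely to keep the example short. The function name fit_line and the sample data are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept
    for a single predictor. Returns (slope, intercept)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Hypothetical data lying exactly on y = 2x + 1.
    slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
    print(f"slope={slope}, intercept={intercept}")
```

On real, noisy data the fitted line will not pass through every point, and the quality of the fit has to be judged in the interpretation and evaluation stage.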
