Data mining (DM) is a very important part of business intelligence (BI). So what exactly is data mining? This article explores that question.
Situations like the following come up often in everyday life: supermarket operators want to know which goods are frequently bought together so that they can increase sales; insurance companies want to know the characteristics of customers who buy insurance; medical researchers want to identify, from thousands of existing medical records, the characteristics shared by patients with a given disease, and so contribute to treating it.
The data analysis tools in existing information management systems cannot answer these questions. Whether the tool is a query, a statistic, or a report, it performs only simple numerical processing on the specified data and cannot extract the intrinsic information the data contains. As information management systems have spread and data volumes have grown rapidly, people have come to want higher-level data analysis capabilities that better support decision-making and scientific research.
It is precisely to meet this need, extracting the useful information hidden in large amounts of data, that data mining, the application of machine learning to large databases, has developed so rapidly.
Data mining (DM), also known as knowledge discovery in databases (KDD), is a high-level process of extracting credible, novel, effective, and understandable patterns from large amounts of data.
Knowledge discovery in databases is a multi-step process, generally divided into the following stages:
Problem definition: become familiar with the relevant domain and its background knowledge, and understand the user's requirements.
Data extraction: extract the relevant data from the database as required.
Data preprocessing: further process the data from the previous stage, checking its integrity and consistency, handling noisy data, and filling in missing values.
Data mining: apply the selected knowledge discovery algorithm to extract the knowledge the user needs from the data; this knowledge can be expressed in a specific form or in a commonly used representation.
Knowledge assessment: present the discovered knowledge in a form the user can understand, and optimize stages of the knowledge discovery process as needed until the requirements are met.
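The stages above can be sketched as a small pipeline. The example below is only an illustration under stated assumptions, not a reference implementation: the records, the function names, and the 0.5 support threshold are all hypothetical, the mining step is a simple co-occurrence count of item pairs (echoing the supermarket example) rather than any particular published algorithm, and preprocessing drops missing items instead of filling them in. Problem definition is not code; here it is assumed to be "find pairs of goods that are often bought together".

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw records: (transaction_id, item). None marks a missing item,
# and transaction 4 contains a duplicate entry (noise).
RAW_RECORDS = [
    (1, "bread"), (1, "milk"), (1, "butter"),
    (2, "bread"), (2, "milk"),
    (3, "milk"), (3, None),
    (4, "bread"), (4, "butter"), (4, "bread"),
    (5, "bread"), (5, "milk"), (5, "butter"),
]


def extract(records):
    """Data extraction: group the relevant items by transaction."""
    baskets = {}
    for tid, item in records:
        baskets.setdefault(tid, []).append(item)
    return baskets


def preprocess(baskets):
    """Data preprocessing: drop missing items and duplicates, sort each basket."""
    return {
        tid: sorted({item for item in items if item is not None})
        for tid, items in baskets.items()
    }


def mine_pairs(baskets):
    """Data mining: count how often each pair of items appears together."""
    counts = Counter()
    for items in baskets.values():
        counts.update(combinations(items, 2))
    return counts


def assess(counts, n_baskets, min_support=0.5):
    """Knowledge assessment: keep pairs whose support meets the threshold."""
    return {
        pair: count / n_baskets
        for pair, count in counts.items()
        if count / n_baskets >= min_support
    }


def run_pipeline(records, min_support=0.5):
    """Run extraction, preprocessing, mining, and assessment in order."""
    baskets = preprocess(extract(records))
    return assess(mine_pairs(baskets), len(baskets), min_support)


if __name__ == "__main__":
    for pair, support in sorted(run_pipeline(RAW_RECORDS).items()):
        print(pair, round(support, 2))
```

Support here means the fraction of transactions that contain both items. If the result does not meet the user's requirements, an earlier stage can be adjusted and the pipeline rerun, for example by lowering min_support, which mirrors the iterative optimization described in the knowledge assessment stage.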
Data mining is thus only one step of knowledge discovery in databases, but it is also the most important one. For this reason, the terms KDD and data mining are often used interchangeably. The research community generally speaks of knowledge discovery in databases, while the engineering community speaks of data mining.
Source: Business Intelligence Alliance: http://freefeet.net/