What Is Data Mining
A couple of days ago I saw a group of people asking what data mining is. This post goes over the concept and tries to explain in plain language what data mining actually is, and why it has become so popular in the big-data era (and it really is popular).
First, the concept:
Data mining is one step in Knowledge Discovery in Databases (KDD). It generally refers to the process of using algorithms to search for information hidden in large amounts of data. Data mining is usually associated with computer science and draws on many methods, including statistics, online analytical processing, information retrieval, machine learning, expert systems (which rely on past rules of thumb), and pattern recognition.
Introduction to Data Mining
Put plainly, data mining means finding valuable information in a huge amount of data, in order to provide a basis for business decisions.
That value falls into the following categories:
1. Correlation
Correlation analysis examines two or more related variables to measure how closely they are related. The variables must have some connection or probabilistic relationship before correlation analysis can be carried out. Correlation does not imply causation, nor is it simply personalization. It applies to almost every field we encounter, and its definition varies considerably between disciplines. It is used to determine how data vary together, that is, whether one or several attributes affect other attributes, and how large the effect is.
The following is an example of correlation:
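One common way to put a number on "how closely two variables are related" is the Pearson correlation coefficient. The sketch below is only an illustration: the `pearson` helper and the advertising and sales figures are made up for this example and do not come from the original post.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: advertising spend and sales over six periods.
ad_spend = [10, 12, 15, 17, 20, 24]
sales = [102, 110, 123, 130, 148, 160]

r = pearson(ad_spend, sales)
print(f"r = {r:.3f}")
```

A value close to 1 or -1 indicates a strong linear relationship, and a value near 0 a weak one. Even a strong value says nothing about which variable, if either, causes the other.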
2. Trends
Trend analysis is a method that compares actual results with historical data for the same indicators across different periods of the financial statements, in order to determine how the financial position, operating results, and cash flows are changing. It can be used to forecast where the data is heading, for example with a line chart, and the comparison can also be expressed as period-over-period (chain) and year-over-year change.
As shown below:
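Assuming the comparison described above refers to the usual period-over-period (chain) and year-over-year growth rates, the arithmetic can be sketched as follows. The `growth_rate` helper and the revenue figures are hypothetical and only serve the illustration.

```python
def growth_rate(current, previous):
    """Relative change from `previous` to `current`, as a fraction."""
    if previous == 0:
        raise ValueError("growth rate is undefined when the base value is 0")
    return (current - previous) / previous


# Hypothetical monthly revenue, keyed by (year, month).
revenue = {
    (2022, 5): 100.0,
    (2023, 4): 120.0,
    (2023, 5): 126.0,
}

# Period-over-period (chain) growth: May 2023 compared with April 2023.
chain = growth_rate(revenue[(2023, 5)], revenue[(2023, 4)])

# Year-over-year growth: May 2023 compared with May 2022.
yoy = growth_rate(revenue[(2023, 5)], revenue[(2022, 5)])

print(f"chain: {chain:.1%}, year-over-year: {yoy:.1%}")
```

The chain figure shows short-term movement, while the year-over-year figure compares against the same period of the previous year and so is less affected by seasonal effects.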
3. Characteristics
What counts as a characteristic depends on what is being analyzed. For Internet products, for example, the typical need is user profiling: attaching appropriate labels to groups of users according to the differences between them.
For example:
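A very simple form of user labeling is a set of rules applied to each user record. The sketch below is illustrative only: the field names, thresholds, and labels are all assumptions, and a real profile would normally be derived from the data rather than written by hand.

```python
def label_user(user):
    """Return the labels that apply to one user record (rules are illustrative)."""
    labels = []
    if user["orders_last_30d"] >= 5:
        labels.append("frequent buyer")
    if user["avg_order_value"] >= 200:
        labels.append("high spender")
    if user["days_since_last_visit"] > 60:
        labels.append("at risk of churn")
    return labels


# Hypothetical user records.
users = [
    {"id": "u1", "orders_last_30d": 7, "avg_order_value": 250,
     "days_since_last_visit": 2},
    {"id": "u2", "orders_last_30d": 0, "avg_order_value": 80,
     "days_since_last_visit": 90},
]

# Map each user id to its list of labels.
profiles = {u["id"]: label_user(u) for u in users}
print(profiles)
```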
Presentation forms
The results of data mining are generally presented in one of several forms:
1. Tables
The first form of presentation is a table, such as a cross-tabulation:
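A cross-table counts how often each combination of two attributes occurs. Here is a minimal sketch using only the Python standard library. The order records, regions, and categories are invented for the example.

```python
from collections import Counter

# Hypothetical records: (region, product category) for each order.
orders = [
    ("North", "Books"), ("North", "Toys"), ("South", "Books"),
    ("South", "Books"), ("North", "Books"), ("South", "Toys"),
]

# Count every (region, category) combination.
counts = Counter(orders)
rows = sorted({region for region, _ in orders})
cols = sorted({category for _, category in orders})

# Print one row per region and one column per category.
print(f"{'':8}" + "".join(f"{c:>8}" for c in cols))
for r in rows:
    print(f"{r:8}" + "".join(f"{counts[(r, c)]:>8}" for c in cols))
```

Because `Counter` returns 0 for missing keys, combinations that never occur still print as 0 instead of raising an error.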
2. Charts
Charts show more than tables do and make the overall shape of the data easy to see at a glance, for example:
3. Decision Tree
In plain terms, the idea behind decision tree classification is similar to looking for a partner. Imagine a mother who wants to introduce a potential boyfriend to her daughter, which leads to the following conversation:
Daughter: How old is he?
Mother: 26.
Daughter: Is he handsome?
Mother: Very handsome.
Daughter: Is his income high?
Mother: Not very high, about average.
Daughter: Is he a civil servant?
Mother: Yes, he works in the Inland Revenue Department.
Daughter: All right, I'll meet him.
The girl's decision process is a typical classification tree. It amounts to sorting men into two classes, meet and don't meet, based on age, appearance, income, and whether he is a civil servant. Suppose her requirements are: no older than 30, above-average looks, and either a high income or a medium income together with a civil service job. Her decision logic can then be represented as follows:
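The decision logic described above can be sketched as a small function in which each `if` corresponds to one node of the tree. This is a hand-written illustration, not a tree learned from data. The function name, the parameter encoding, and the reading of the age requirement as "30 or younger" are assumptions.

```python
def will_meet(age, looks_ok, income, civil_servant):
    """Hand-written decision tree for the example (encoding is assumed).

    looks_ok: True if he meets her requirement for appearance.
    income: one of "low", "medium", "high".
    civil_servant: True if he is a civil servant.
    """
    if age > 30:            # assumption: the limit is 30 or younger
        return False
    if not looks_ok:
        return False
    if income == "high":
        return True
    if income == "medium":
        return civil_servant  # medium income only counts for civil servants
    return False              # low income


# The candidate from the dialogue: 26, handsome, medium income, civil servant.
print(will_meet(26, True, "medium", True))  # prints True
```

A data mining algorithm would learn the order of the questions and the thresholds from labeled examples. Here they are written out by hand to mirror the dialogue.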
Areas covered by data mining
Data mining is an interdisciplinary research field within computer science, and its methods are closely related to many other disciplines, such as statistics, machine learning, expert systems, information retrieval, social networks, natural language processing, and pattern recognition.
Summary
This post gave a brief introduction to the concept of data mining, the forms in which its results are presented, and what data mining can actually do. Later posts will go into more depth, so that we can improve together.
Data Mining with Me (18): What Is Data Mining (1)