Data Mining Overview

Source: Internet
Author: User
Recently, I had the opportunity to work with some data mining techniques.
I personally believe this technology has strong prospects for development,
so I am using this article to share my views on data mining
and to explain the concept step by step.

(1) The Emergence of Data Mining

Development and application of data storage technology:
Every technology must be tied to an application. Put simply, data mining applications are built on top of data storage.
Over the past 10 years, the widespread use of the Internet and the integration of enterprise information management have driven the rapid development of data storage technology.

Enterprises have grown used to moving their former paperwork onto computers, and databases provide the foundation for this work.
A large number of excellent database management systems have emerged, such as Oracle, SQL Server, and DB2. In general, however, these databases all provide the same function: data storage.

People can build their own programs to use and manage this data. The earliest applications concentrated on two operations: retrieval and update. A simple example:
When we submit an essay on cnblogs, the content of the article is sent to a web application running on the server, which stores it in a database (in a table).
When someone wants to read it, the web application retrieves it from the database and sends the content over the network to the reader's browser.
You can also delete the essay. The database then deletes the record (or updates an isdeleted field).
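
The store, retrieve, and delete cycle described above can be sketched in a few lines. This is only an illustration, not how cnblogs works: it assumes an in-memory SQLite database, and the table name `articles`, its columns, and the three helper functions are all hypothetical.

```python
import sqlite3

# Hypothetical schema: one table holding submitted articles.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " body TEXT NOT NULL,"
    " is_deleted INTEGER NOT NULL DEFAULT 0)"
)

def submit_article(title, body):
    """Store a new article and return its id."""
    cur = conn.execute(
        "INSERT INTO articles (title, body) VALUES (?, ?)", (title, body)
    )
    conn.commit()
    return cur.lastrowid

def fetch_article(article_id):
    """Return (title, body) for a visible article, or None."""
    return conn.execute(
        "SELECT title, body FROM articles WHERE id = ? AND is_deleted = 0",
        (article_id,),
    ).fetchone()

def delete_article(article_id):
    """Soft delete: flag the record instead of removing the row."""
    conn.execute(
        "UPDATE articles SET is_deleted = 1 WHERE id = ?", (article_id,)
    )
    conn.commit()

article_id = submit_article("Data Mining Overview", "My views on data mining.")
print(fetch_article(article_id))   # the stored title and body
delete_article(article_id)
print(fetch_article(article_id))   # None, because the row is flagged as deleted
```

The soft delete keeps the row in the table, which is one reason such databases keep growing even when users no longer see the data.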

The problem: massive data and data graves:
"Massive data" is a figurative term.
How many sales records can a large supermarket generate every day (I am sure it already uses a sales management system)?
The answer: tens of thousands.
How much data does that produce in a year?
The answer: massive data.

However, whether a dataset counts as a data grave is not determined by the amount of data.
A data grave comes into being when the dataset no longer serves any purpose.

Take the sales records of a large supermarket as an example: retrieving any single sales record is meaningless.
These huge volumes of records sit in the database, and nobody looks at them for 10 years. No one has the energy to retrieve them one by one.
Simply storing them brings no benefit and adds no value to the enterprise, because the dataset is too large.

Statistics, the prototype of mining:
Some will say: when faced with a large number of sales records, we do not retrieve them one by one; instead we compile a statistical report and put it on the sales manager's desk. The sales manager can view this year's sales, the sales for each quarter, and the average monthly sales.
This is easy to achieve with a database, and many enterprises already do exactly this.
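
A report of this kind comes down to a few aggregate queries. The sketch below assumes a hypothetical `sales` table with a date and an amount per record, held in an in-memory SQLite database; the sample rows are invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, amount REAL)")

# Invented sample data, not real sales figures.
sample_rows = [
    ("2023-01-15", 120.0),
    ("2023-02-03", 80.0),
    ("2023-04-20", 200.0),
    ("2023-07-09", 150.0),
    ("2023-11-30", 50.0),
]
conn.executemany("INSERT INTO sales VALUES (?, ?)", sample_rows)

def yearly_total(year):
    """Total sales amount for one calendar year (0.0 if there are none)."""
    row = conn.execute(
        "SELECT SUM(amount) FROM sales WHERE strftime('%Y', sale_date) = ?",
        (str(year),),
    ).fetchone()
    return row[0] or 0.0

def quarterly_totals(year):
    """Map quarter number (1-4) to total sales for that quarter."""
    rows = conn.execute(
        "SELECT (CAST(strftime('%m', sale_date) AS INTEGER) + 2) / 3 AS quarter,"
        " SUM(amount) FROM sales"
        " WHERE strftime('%Y', sale_date) = ?"
        " GROUP BY quarter ORDER BY quarter",
        (str(year),),
    ).fetchall()
    return dict(rows)

def average_monthly_sales(year):
    """Yearly total spread evenly over twelve months."""
    return yearly_total(year) / 12

print(yearly_total(2023))
print(quarterly_totals(2023))
print(average_monthly_sales(2023))
```

Each function only summarizes what is already recorded; none of them finds a relationship between records, which is the gap the following paragraphs point to.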

Many enterprises say: our management system can do this, that is enough, and we are very satisfied.
However, in a highly competitive economy, some people have asked:
Is this all that the data stored in the database can provide? Might it contain more knowledge and rules that we have not yet discovered?
When setting the sales strategy for the coming year, our sales manager usually studies the sales statistics of the past few years.
This still seems too subjective, and it leaves the feeling that we are not making full use of the data we have.

Data mining:
Does data mining still seem rather abstract?
A concrete example illustrates the idea:
A very typical application of data mining is shopping basket analysis.
When deciding how to arrange the supermarket shelves for the coming year, the sales manager tends to place bread and milk together on subjective grounds, without knowing how the goods are actually bought.
Beyond such subjective strategies, the sales manager also hopes the system can derive more direct hints from the existing sales records, or even generate a shelf layout chart directly. The basis is which kinds of goods customers habitually put in the shopping cart together.

Therefore, the new system needs to extract rules from the data automatically and provide information that supports decision-making.
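
As a rough illustration of what "extracting rules" can mean in shopping basket analysis, the sketch below counts how often pairs of items appear in the same basket and derives two common measures: support (the share of baskets containing both items) and confidence (the share of baskets with the first item that also contain the second). The baskets, the function name `pair_rules`, and the 0.4 threshold are assumptions made for this example; real systems use more elaborate algorithms and far larger data.

```python
from collections import Counter
from itertools import combinations

# Invented baskets; each set is one customer's purchase.
baskets = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "butter"},
]

def pair_rules(baskets, min_support=0.4):
    """Return rules (lhs, rhs, support, confidence) for frequent item pairs."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair
        for basket in baskets
        for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # pair is bought together too rarely to matter
        # One rule per direction: "buys a, also buys b" and the reverse.
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    # Highest confidence first; item names break ties deterministically.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))

for lhs, rhs, support, confidence in pair_rules(baskets):
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data, bread and milk share a basket in three of five purchases, which is the kind of evidence that could back up, or contradict, the manager's subjective shelf layout.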

This is how data mining came about.
The path so far:
Data storage technology ---> Statistics ---> Data mining and decision support
The driving force is the enterprise's expectation of "making decisions through historical data".
The definition of data mining can thus be summarized as "extracting valuable information and knowledge from massive data".
