Comparative Analysis of Data Warehouse, OLAP and Data Mining

Source: Internet
Author: User
Keywords: data warehouse, data mining, OLAP
1. The relationship among data warehouse, OLAP and data mining
In a mature system, data analysis is built on a data warehouse, with OLAP and data mining complementing each other. The data warehouse stores data from various sources, organized by subject, after the raw data has been extracted, transformed and loaded, a process that includes filtering and cleaning. OLAP presents the data to users from multiple perspectives and at multiple levels of detail. Data mining applies algorithms to reveal regularities in the data and so supports business decision-making. For example, in a CRM (Customer Relationship Management) application, the data warehouse selects and stores data around the subject "customer". OLAP analyzes customers' basic information, savings account information, historical balances, bank transaction logs and so on, and presents the results to managers as dynamic analysis reports, histograms, line charts, pie charts, etc. Managers can then follow customer activity from many angles, identify customers' trading habits and patterns of customer loss, and run suitable product marketing campaigns for different types of customers at different times. Data mining builds a model from historical data, projects future trends on the basis of fitting the history, and identifies which factors are likely to signal the eventual loss of a customer, so that the loss can be avoided.
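The warehouse and OLAP steps described above can be sketched in a few lines of Python. This is only an illustrative sketch: the two source systems, the field names and all values are invented for the example, and `etl` and `roll_up` are hypothetical helpers, not part of any real product.

```python
from collections import defaultdict

# Hypothetical raw records from two source systems (invented for illustration).
core_banking = [
    {"cust": " A01 ", "region": "north", "balance": "1200.50"},
    {"cust": "A02", "region": "South", "balance": "300"},
    {"cust": "A03", "region": "south", "balance": None},  # dirty row: no balance
]
card_system = [
    {"customer_id": "A01", "card_spend": 250.0},
    {"customer_id": "A02", "card_spend": 40.0},
]

def etl(accounts, cards):
    """Extract, clean and merge records into one table on the subject 'customer'."""
    spend = {row["customer_id"]: row["card_spend"] for row in cards}
    warehouse = []
    for row in accounts:
        if row["balance"] is None:             # filtering: drop incomplete rows
            continue
        cust = row["cust"].strip()             # cleaning: trim stray whitespace
        warehouse.append({
            "customer_id": cust,
            "region": row["region"].lower(),   # transformation: normalize case
            "balance": float(row["balance"]),  # transformation: text to number
            "card_spend": spend.get(cust, 0.0),
        })
    return warehouse

def roll_up(rows, dimension, measure):
    """OLAP-style aggregation: total one measure along one dimension."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)

customers = etl(core_banking, card_system)
balance_by_region = roll_up(customers, "region", "balance")
print(balance_by_region)
```

Calling `roll_up` with a different dimension or measure gives another view of the same loaded data, which is the sense in which OLAP presents data from multiple perspectives.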
2. From database to data warehouse
Traditional database technology can be divided into two categories: operational and analytical. Operational processing, also known as transaction processing, is the day-to-day online operation of the database. It usually involves querying and modifying one record or a small group of records, serves specific enterprise applications, and focuses on response time, data security and integrity. Analytical processing is online access and analysis aimed at specific problems: through stable, consistent and interactive access to the many possible views of the information, it allows analysts to observe the data in depth. A traditional database can handle an enterprise's daily business processing, but it struggles to meet the requirements of data analysis and diversified processing. The data warehouse makes up for this shortcoming. The original environment, centered on the database as a single data resource, develops into a subject-oriented system environment dedicated to supporting high-level decision analysis. A data warehouse is not a substitute for a database. Most data warehouses use a relational database management system to manage their data.
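The contrast between the two kinds of access can be illustrated with Python's built-in `sqlite3` module. This is a minimal sketch: the `account` table and its rows are invented for the example, and a real warehouse would hold far more data in a separate system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, region TEXT, balance REAL)"
)
conn.executemany(
    "INSERT INTO account (id, region, balance) VALUES (?, ?, ?)",
    [(1, "north", 100.0), (2, "north", 250.0), (3, "south", 75.0)],
)

# Operational (transaction processing): touch a single record, keep it consistent.
with conn:  # commits on success, rolls back if an exception is raised
    conn.execute("UPDATE account SET balance = balance - 20.0 WHERE id = ?", (1,))
one_balance = conn.execute(
    "SELECT balance FROM account WHERE id = ?", (1,)
).fetchone()[0]

# Analytical: scan many records and summarize them along a dimension.
summary = conn.execute(
    "SELECT region, COUNT(*), SUM(balance) FROM account "
    "GROUP BY region ORDER BY region"
).fetchall()
conn.close()

print(one_balance)
print(summary)
```

The first query is judged by how quickly and safely it changes one row; the second is judged by what it reveals about all rows taken together.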
3. Differences and connections between OLAP and data mining
The main difference between OLAP and data mining is how they support decision-making. OLAP is driven by a series of hypotheses set up by the user, and confirming or overturning those hypotheses through OLAP is a deductive reasoning process. Data mining uses induction to search for models in massive data and automatically discovers valuable information hidden in the data. For example, an OLAP analyst may suppose that users who open credit cards in wealthy regions spend more actively. To test this assumption, he may look at the credit card account attributes of users who applied for cards in those regions. If the results are not clear enough, he may also take age into account. He continues until he believes he has found the variables that determine whether a customer spends actively on a credit card. Based on these variables, he plans how to market the bank's products and concentrates marketing resources on the customers most likely to accept them. Now suppose a data mining analyst reaches the same conclusion as the OLAP analyst, but by the opposite route. The data mining analyst feeds many factors or variables into a data mining tool, and the tool builds the model itself. After discarding the factors or variables that are unrelated to credit card consumption or not significant for it, the tool arrives at the same result, here assumed to be the region and age factors. Of course, in practice the factors derived by the two approaches will not necessarily be the same. Put simply, compared with OLAP, data mining hands more of the initiative to the mining tool, and to a certain extent it can be regarded as an elementary application of artificial intelligence.
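The two routes in the credit card example can be contrasted in a short Python sketch. The customer records are invented for illustration, and information gain is used here only as one simple stand-in for "the tool decides which variables matter"; the text above does not name a specific mining algorithm.

```python
from collections import Counter
from math import log2

# Invented sample: (region, age, channel, active spender?)
raw = [
    ("rich", "young", "branch", 1), ("rich", "young", "online", 1),
    ("rich", "old", "branch", 1), ("rich", "old", "online", 1),
    ("other", "young", "branch", 1), ("other", "young", "online", 1),
    ("other", "old", "branch", 0), ("other", "old", "online", 0),
]
customers = [
    {"region": r, "age": a, "channel": c, "active": y} for r, a, c, y in raw
]

def active_rate_by(rows, attribute):
    """OLAP route: the analyst picks one attribute and inspects that slice."""
    totals, actives = Counter(), Counter()
    for row in rows:
        totals[row[attribute]] += 1
        actives[row[attribute]] += row["active"]
    return {value: actives[value] / totals[value] for value in totals}

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attribute):
    """Reduction in label entropy obtained by splitting on one attribute."""
    base = entropy([row["active"] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row["active"])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

def select_attributes(rows, candidates, min_gain=0.01):
    """Mining route: score every candidate and keep the informative ones."""
    gains = {a: information_gain(rows, a) for a in candidates}
    return sorted(a for a, g in gains.items() if g > min_gain)

# The analyst tests the hypothesis "region matters" by hand.
print(active_rate_by(customers, "region"))
# The tool is given all variables and discards the uninformative one itself.
print(select_attributes(customers, ["region", "age", "channel"]))
```

In this toy data both routes end up with region and age, but the first requires the analyst to propose each variable in turn, while the second only requires supplying the candidates.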
In addition, OLAP is limited to structured data and focuses on interaction with users, rapid response and multi-dimensional views, while data mining can also analyze unstructured data such as text, spatial data and multimedia.
Although the two differ considerably in perspective and level, OLAP and data mining also complement each other. OLAP's analysis results can provide a basis for data mining, and data mining can extend the depth of OLAP analysis and uncover more complex and detailed information.