1. The concept of the data warehouse began in the 1980s and first appeared in a book by William H. Inmon, known as "the father of the data warehouse". As understanding of how to study, manage, and maintain large data systems deepened, and as experience from many kinds of enterprise information systems was summarized, enriched, and consolidated, a more precise definition emerged: "a data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of data that supports enterprise management and decision-making". Data warehousing has no strict theoretical foundation and no mature basic model; it is primarily an engineering discipline. Its key technologies are generally divided into three basic areas: data extraction, data storage and management, and data presentation.
The key requirement of a data warehouse is that it can accurately, securely, and reliably retrieve data from databases, process and convert it into well-organized information, and make that information available to management for analysis and use. Data warehouses are mainly used in decision support systems. Their main purpose is to "extract" information, and this purpose extends into applications built on the data warehouse-based decision support system (DSS).
2. A data warehouse-based decision support system (DSS) is composed of three components: data warehousing, online analytical processing (OLAP), and data mining.
OLAP (Online Analytical Processing) enables analysts, managers, and executives to gain insight into data through fast, consistent, and interactive access to information from multiple perspectives. This information is converted from raw data, can be truly understood by users, and truly reflects the characteristics of the enterprise. OLAP aims to meet the query and reporting requirements of decision support or multi-dimensional environments. A data warehouse focuses on storing and managing decision-oriented data, while OLAP focuses on analyzing the data in the warehouse and converting it into information that assists decision-making. A major feature of OLAP is multi-dimensional data analysis, which complements the multi-dimensional data organization of the data warehouse. Combining OLAP with a data warehouse therefore better addresses a problem that traditional decision support systems face: the need to process both a large volume of data and a large amount of numerical computation.
OLAP multi-dimensional data analysis examines the data held in the database in depth by slicing, drilling, and rotating (pivoting) along the dimensions of multi-dimensional data, and thereby supports decision makers. The multi-dimensional structure is the pillar of decision support and the core of OLAP.
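The slicing, drilling, and rotating operations can be illustrated with a small in-memory sketch. This is only an illustration, not how an OLAP server is implemented: the fact table `FACTS`, the dimension names in `DIMS`, and the helper functions `aggregate`, `slice_cube`, and `pivot` are all hypothetical names chosen for this example.

```python
from collections import defaultdict

# Hypothetical fact table: (year, quarter, region, product, sales amount).
FACTS = [
    (2023, "Q1", "North", "Widget", 100),
    (2023, "Q1", "South", "Widget", 80),
    (2023, "Q2", "North", "Gadget", 50),
    (2023, "Q2", "South", "Widget", 70),
    (2024, "Q1", "North", "Gadget", 120),
    (2024, "Q1", "South", "Gadget", 60),
]
DIMS = ("year", "quarter", "region", "product")


def aggregate(facts, dims):
    """Sum the sales measure grouped by the named dimensions."""
    idx = [DIMS.index(d) for d in dims]
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


def slice_cube(facts, dim, value):
    """Slice: keep only the facts where one dimension has a fixed value."""
    i = DIMS.index(dim)
    return [row for row in facts if row[i] == value]


def pivot(totals):
    """Rotate a two-dimensional result by swapping its row and column keys."""
    return {(b, a): v for (a, b), v in totals.items()}


# Slice on region = "North", then view the result by year.
north_by_year = aggregate(slice_cube(FACTS, "region", "North"), ["year"])

# Drill down from year to (year, quarter); the reverse direction is a roll-up.
by_year = aggregate(FACTS, ["year"])
by_year_quarter = aggregate(FACTS, ["year", "quarter"])

# Rotate a region-by-product view into a product-by-region view.
region_product = aggregate(FACTS, ["region", "product"])
product_region = pivot(region_product)
```

Drilling down changes only the level of detail: the quarterly totals still add up to the yearly totals, which is why the two views can be used interchangeably during analysis.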
Data mining is the process of extracting potentially useful information and knowledge, hidden and not known beforehand, from massive, incomplete, noisy, fuzzy, and random data. It can be seen as a data search process that does not need assumptions or questions to be posed in advance, yet can still find unexpected but interesting information that reveals relationships and patterns among data elements. It can uncover key patterns in the data and find the most valuable information and knowledge, which then guides business behavior or assists scientific research. Its targets are large and ultra-large data sets.
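As a rough illustration of finding relationships among data elements without posing a question in advance, the sketch below counts which items occur together across records. This is a much-simplified form of association analysis, not a full mining algorithm; the `TRANSACTIONS` data and the `frequent_pairs` function are hypothetical and exist only for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase records; each set is one transaction.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "eggs"},
]


def frequent_pairs(transactions, min_support):
    """Return item pairs that appear together in at least min_support transactions."""
    counts = Counter()
    for items in transactions:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


pairs = frequent_pairs(TRANSACTIONS, min_support=3)
```

Nothing in the code names bread or milk as items of interest; the pairing surfaces from the counts alone. On real, very large data sets this brute-force enumeration would be too slow, and pruning techniques would be needed.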
Dr. William Inmon, a famous American information engineering expert, proposed the concept of the data warehouse in the early 1990s. In his view, "a data warehouse is a subject-oriented, integrated, time-variant, but relatively stable collection of data that is used to support the management and decision-making process."
The subject refers to the key aspects that users care about when using a data warehouse for decision-making, such as revenue, customers, and sales channels. Information in the data warehouse is organized by subject, rather than by business function as in business support systems.
Integration means that the information in the data warehouse is not simply extracted from the various business systems; it goes through a series of processing, sorting, and aggregation steps. As a result, the information in the data warehouse is consistent, global information about the entire enterprise.
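A minimal sketch of what such integration can look like, assuming two source systems that describe the same customers with different field names, codes, and units. The `BILLING` and `CRM` extracts, their field names, and the `integrate` function are all hypothetical.

```python
# Hypothetical extracts from two business systems that describe the same
# customers with different field names, codes, and units.
BILLING = [
    {"cust_id": "C001", "gender": "M", "revenue_cents": 125000},
    {"cust_id": "C002", "gender": "F", "revenue_cents": 98000},
]
CRM = [
    {"customer": "c001", "sex": "male", "channel": "online"},
    {"customer": "c002", "sex": "female", "channel": "store"},
]

# One agreed-upon encoding for a value that the sources encode differently.
GENDER_CODES = {"M": "male", "F": "female", "male": "male", "female": "female"}


def integrate(billing, crm):
    """Merge both sources into one record per customer with unified conventions."""
    merged = {}
    for row in billing:
        key = row["cust_id"].upper()
        merged[key] = {
            "customer_id": key,
            "gender": GENDER_CODES[row["gender"]],
            "revenue": row["revenue_cents"] / 100,  # cents -> currency units
        }
    for row in crm:
        key = row["customer"].upper()
        record = merged.setdefault(key, {"customer_id": key})
        # Fill in gender only if the billing system did not already supply it.
        record.setdefault("gender", GENDER_CODES[row["sex"]])
        record["channel"] = row["channel"]
    return [merged[k] for k in sorted(merged)]


customers = integrate(BILLING, CRM)
```

After this step each customer appears once, with one identifier format, one gender encoding, and one revenue unit, regardless of which system the values came from.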
Time-variant means that the information in the data warehouse does not reflect only the current state of the enterprise; it records information from a certain point in the past up to the present. With this information, you can quantitatively analyze the enterprise's development history and predict its future trends.
Relatively stable means that once data enters the data warehouse, it is rarely modified; it is mostly queried.
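The time-variant and relatively stable properties can be sketched together as an append-only store in which every row carries a date and existing rows are never updated. The `SnapshotStore` class, its method names, and the sample figures are hypothetical and serve only as an illustration.

```python
from datetime import date


class SnapshotStore:
    """Append-only store: rows are added with a date and never updated in place."""

    def __init__(self):
        self._rows = []  # each row is (snapshot_date, key, value)

    def load(self, snapshot_date, key, value):
        """Record a new dated value; earlier rows are left untouched."""
        self._rows.append((snapshot_date, key, value))

    def as_of(self, key, when):
        """Return the most recent value for key on or before the given date."""
        candidates = [(d, v) for d, k, v in self._rows if k == key and d <= when]
        if not candidates:
            return None
        return max(candidates, key=lambda c: c[0])[1]

    def history(self, key):
        """Return every recorded (date, value) pair for key, oldest first."""
        return sorted(
            ((d, v) for d, k, v in self._rows if k == key),
            key=lambda c: c[0],
        )


store = SnapshotStore()
store.load(date(2023, 12, 31), "revenue", 300)
store.load(date(2024, 12, 31), "revenue", 180)

mid_2024 = store.as_of("revenue", date(2024, 6, 30))
trend = store.history("revenue")
```

Because older rows are kept rather than overwritten, the same store answers both "what was the value at that time" and "how has it changed", which is what makes historical analysis and trend prediction possible.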
Given the above definition, some people may regard a data warehouse simply as a large data storage mechanism, a static concept. In fact, a data warehouse is better understood as a process: it collects, sorts, and processes data, generates the information required for decision-making, and finally delivers that information to the users who need it so that they can make sound decisions and improve their business operations. The key requirement of a data warehouse is that it can accurately, securely, and reliably retrieve data from business systems and convert it into well-organized information for analysis and use by management. A data warehouse is therefore a dynamic concept, and the process is called data warehousing.