Data Warehouse (data warehousing): "lifting the veil". Here I would like to share some of my beginner's understanding and views. I hope they can serve as a simple introduction for fellow novices, and perhaps prompt more experienced readers to share better ideas.
To talk about data warehousing, let's first look at the background from which it emerged. Starting with dBASE III (dBASE was a database management program of the DOS era, originally published by Ashton-Tate and later acquired by Borland), database applications moved onto the PC, and small and medium-sized enterprises entered the stage of commercial database use. At that time databases were used mainly to record data; we call this kind of system an OLTP (online transaction processing) database system. It is designed primarily for transaction processing, so it handles transactional work well, for example supporting a large number of users who create or modify data.
But as data processing needs became more diverse, managers needed to analyze the data to support decisions, which means frequent access to large volumes of historical data. With a traditional database structure, an experienced administrator who is proficient in the query language can do this with complex statements, but in terms of performance and security it is impractical. Moreover, the data in a real enterprise is usually spread across different systems, so a comprehensive analysis may need to pull information from several different types of data sources, which makes it difficult in practice.
The "data warehouse" arose to solve these problems. In the words of Bill Inmon, the father of the data warehouse: a data warehouse refines, processes, and integrates an enterprise's business data into information that is useful to the enterprise; it is a subject-oriented, stable, time-variant database system used mainly for decision support.
Simply put: the data in a data warehouse is read-only. It is extracted from the transaction processing systems at set intervals and restructured according to the needs of analytical queries.
From the point of view of its data, a data warehouse has the following characteristics:
1. It integrates scattered, heterogeneous data into a single data source.
2. It stores data in a structure suited to analytical queries.
3. The data it contains is transaction data converted into a new form that is convenient for decision makers to analyze.
4. The data is stable. New data is typically loaded at intervals, unlike day-to-day operational data, which changes frequently.
5. Only the data relevant to analysis is filtered out and stored.
6. The data is historical, often recorded over many years.
7. Because the data is a record of the past, it does not need to be changed once entered; new data is simply loaded periodically.
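The characteristics above can be illustrated with a small sketch. It is only an illustration under assumptions: the two source systems, their field names (`order_no`, `amt_cents`, `id`, `total`), and the `sales_history` table are hypothetical and not taken from any real product. It uses Python's built-in `sqlite3` module to convert two differently shaped sources into one format, keep only the fields needed for analysis, and append them to a history table.

```python
import sqlite3

# Hypothetical records from two separate transaction systems with different layouts.
shop_orders = [
    {"order_no": "A-1", "day": "2024-01-05", "amt_cents": 1250, "cashier": "li"},
    {"order_no": "A-2", "day": "2024-01-06", "amt_cents": 900, "cashier": "wang"},
]
web_orders = [
    {"id": 17, "created": "2024-01-05T10:30:00", "total": 20.0, "session": "x9"},
]


def to_warehouse_rows(shop, web):
    """Convert both layouts to one format, keeping only fields needed for analysis."""
    rows = []
    for r in shop:
        rows.append(("shop", r["order_no"], r["day"], r["amt_cents"] / 100))
    for r in web:
        rows.append(("web", str(r["id"]), r["created"][:10], float(r["total"])))
    return rows


def load(conn, rows):
    """Periodic load: rows are only appended, never updated."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_history ("
        "source TEXT, order_id TEXT, sale_date TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales_history VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, to_warehouse_rows(shop_orders, web_orders))

# Analysis reads the integrated history; it does not modify it.
daily_totals = conn.execute(
    "SELECT sale_date, SUM(amount) FROM sales_history "
    "GROUP BY sale_date ORDER BY sale_date"
).fetchall()
print(daily_totals)  # [('2024-01-05', 32.5), ('2024-01-06', 9.0)]
```

In a real setting the load step would run on a schedule against the live transaction systems; here a single call stands in for one such interval.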
As the name suggests, a data warehouse is a "warehouse" holding a large collection of data. Because applications differ in scale and departments differ in their actual needs, data warehouses can be divided into the following types:
Δ Data warehouse (the standard form of data warehouse application, built around the structure of the entire enterprise)
Δ Data mart (smaller in scale and suited to a single department of the enterprise; its design should anticipate eventually becoming part of a data warehouse)
Δ Multi-tiered data warehouse (a combination of a data warehouse and data marts, in which the lower-level data marts obtain their information from the upper-level data warehouse; this keeps the data consistent and reduces the load on the data warehouse)
Δ Federated data warehouse (for scenarios where existing data marts are intended to be integrated into a data warehouse in the future)
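The multi-tiered arrangement, in which a smaller departmental store is filled from the enterprise-wide warehouse, can be sketched roughly as follows. This is a simplified illustration, not a recommended deployment: the table names `warehouse_sales` and `toys_mart` are hypothetical, and both tiers live in one in-memory SQLite database only to keep the example short, whereas in practice they would usually be separate databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Upper tier: a hypothetical enterprise-wide warehouse table.
conn.execute(
    "CREATE TABLE warehouse_sales (department TEXT, sale_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [
        ("toys", "2024-01-05", 30.0),
        ("toys", "2024-01-06", 12.0),
        ("books", "2024-01-05", 8.0),
    ],
)

# Lower tier: a departmental data mart filled only from the warehouse,
# so its figures stay consistent with the enterprise-wide data.
conn.execute(
    "CREATE TABLE toys_mart AS "
    "SELECT sale_date, amount FROM warehouse_sales WHERE department = 'toys'"
)

# Departmental queries now run against the smaller mart, not the warehouse.
toys_total = conn.execute("SELECT SUM(amount) FROM toys_mart").fetchone()[0]
print(toys_total)  # 42.0
```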
As for structure, we all know that relational databases are modeled with the E-R (entity-relationship) model, whereas a data warehouse uses a dimensional model. Within the dimensional model, the most commonly used data warehouse structures are the "star" structure and its extension, the "snowflake" structure.
(Dimensions are the categories by which an organization classifies its data warehouse data, such as time, location, person in charge, and so on.)
The two structures are described below:
The star structure is a relational database structure in which a fact table is surrounded by dimension tables. The fact data is kept in the fact table, and the dimension data is kept in the dimension tables. Each dimension table is linked directly to the fact table through a single key.
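A fact table surrounded by dimension tables might look like the following sketch, written with Python's built-in `sqlite3` module. The table and column names (`fact_sales`, `dim_time`, `dim_store`) and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: descriptive data, each with a single-column key.
    CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, city TEXT);

    -- Fact table: the measures plus one foreign key per dimension.
    CREATE TABLE fact_sales (
        time_id INTEGER REFERENCES dim_time(time_id),
        store_id INTEGER REFERENCES dim_store(store_id),
        amount REAL
    );

    -- Hypothetical sample data.
    INSERT INTO dim_time VALUES (1, 2024, 1), (2, 2024, 2);
    INSERT INTO dim_store VALUES (10, 'Beijing'), (20, 'Shanghai');
    INSERT INTO fact_sales VALUES (1, 10, 100.0), (1, 20, 50.0), (2, 10, 70.0);
    """
)

# A typical analytical query: join the fact table to its dimensions and aggregate.
sales_by_city = conn.execute(
    """
    SELECT s.city, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_store AS s ON s.store_id = f.store_id
    JOIN dim_time AS t ON t.time_id = f.time_id
    WHERE t.year = 2024
    GROUP BY s.city
    ORDER BY s.city
    """
).fetchall()
print(sales_by_city)  # [('Beijing', 170.0), ('Shanghai', 50.0)]
```

Every dimension is exactly one join away from the fact table, which is what gives the layout its star shape.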
The star structure is well suited to data warehouse database design, for the following reasons:
1. The design is flexible: it is easy to modify and extend as the data warehouse grows or the application changes.
2. It is easy for both developers and end users to understand.
3. It mirrors the way end users typically think about and use their business data.
4. It is easy to implement as a physical database, and many DBMSs can recognize and optimize for this structure, so the design can yield efficient queries.
The snowflake structure is an extension of the star structure that simply adds auxiliary detail. Its dimension tables store normalized data, which can improve query performance by reducing the number of disk reads, at the cost of extra joins. A dimension is decomposed into a primary dimension table that is linked directly to the fact table and secondary dimension tables that are linked to the primary dimension table.
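The split into primary and secondary dimension tables can be sketched as follows, again with Python's built-in `sqlite3` module. The names `dim_product` (primary dimension) and `dim_category` (secondary dimension), and the sample rows, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Secondary dimension table: linked to the primary dimension, not to the fact table.
    CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, category_name TEXT);

    -- Primary dimension table: linked directly to the fact table.
    CREATE TABLE dim_product (
        product_id INTEGER PRIMARY KEY,
        product_name TEXT,
        category_id INTEGER REFERENCES dim_category(category_id)
    );

    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        amount REAL
    );

    -- Hypothetical sample data.
    INSERT INTO dim_category VALUES (1, 'Toys'), (2, 'Books');
    INSERT INTO dim_product VALUES (100, 'Kite', 1), (101, 'Puzzle', 1), (102, 'Atlas', 2);
    INSERT INTO fact_sales VALUES (100, 5.0), (101, 7.5), (102, 20.0), (100, 5.0);
    """
)

# Reaching the category name takes two joins: fact -> product -> category.
sales_by_category = conn.execute(
    """
    SELECT c.category_name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_category AS c ON c.category_id = p.category_id
    GROUP BY c.category_name
    ORDER BY c.category_name
    """
).fetchall()
print(sales_by_category)  # [('Books', 20.0), ('Toys', 17.5)]
```

The category name is stored once per category rather than once per product, which is the normalization the snowflake structure adds.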
The above is a brief introduction to some basics of the data warehouse. Designing and managing a data warehouse is, I think, very practical work: it requires solid theoretical knowledge as well as continual exploration and review in real applications. As a newcomer to data warehousing, understanding this much has made me more eager to learn. I hope the experts among you will not laugh at what I have written here. If you have better insights or experience, I hope we can exchange them. I believe good communication is the "high-quality calcium" that helps us grow. :)