Data Warehousing

Source: Internet
Author: User
Tags: dbase

I have only just started learning about data warehousing. Here I will share some of my new understanding and opinions. I hope this serves as a simple introduction for beginners, and I hope it will be helpful.

Before discussing the data warehouse, let's take a look at its background. With the rise of dBase III (dBase was a database management program from the DOS era, originally published by Ashton-Tate and later acquired by Borland), database applications reached personal computers, and small and medium-sized enterprises entered the stage of commercial database applications. At that time, database applications mainly recorded data. We call this type of system an OLTP (online transaction processing) database system. It is designed mainly for transaction processing, so it handles such work conveniently; for example, it supports a large number of users creating or modifying data.
However, as data processing requirements became increasingly diverse, managers needed to analyze the data to support decisions, which means frequently accessing large amounts of historical data. Administrators with experience in query syntax can do this on a traditional database structure through complex statements, but the performance and security of such statements make them impractical. In addition, in reality most of an enterprise's data is distributed across different systems, so a comprehensive analysis may have to gather the same kind of information from several different data sources, which makes practical application difficult.

To solve these problems, the "data warehouse" came into being. According to Bill Inmon, the father of the data warehouse, a data warehouse extracts, processes, and aggregates enterprise transactional data into information that is useful to the enterprise. It is subject-oriented, stable (non-volatile), and time-variant, and it is a database system used mainly for decision support.

Simply put, the data in a data warehouse is read-only. It is retrieved from the transaction processing system at regular intervals and restructured according to analysis and query needs.
From the data perspective, it has the following features:
1. It integrates scattered, heterogeneous data into a single data source.
2. It stores data in a structure suited to analysis and queries.
3. The data it contains has been transformed from the transactional data, which makes it convenient for decision makers to analyze.
4. The data is stable. New data is generally loaded at intervals, unlike the continuous updates of routine operations.
5. Only the data relevant to analysis is selected and stored.
6. Data in a data warehouse is historical and is often accumulated over many years.
7. Because the data is a historical record of the past, it does not need to be changed once entered. You only need to load new data on a regular basis.
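The periodic, append-only loading described in the points above can be sketched roughly as follows. This is a minimal illustration using Python's standard `sqlite3` module with in-memory databases. The table names (`orders`, `daily_sales`), the columns, and the `load_day` helper are all assumptions made up for the example; they are not part of any particular product.

```python
import sqlite3

# Hypothetical OLTP source: one row per order, freely updated by the application.
oltp = sqlite3.connect(":memory:")
oltp.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, day TEXT, region TEXT, amount REAL)"
)
oltp.executemany(
    "INSERT INTO orders (day, region, amount) VALUES (?, ?, ?)",
    [
        ("2024-01-01", "north", 10.0),
        ("2024-01-01", "north", 5.0),
        ("2024-01-01", "south", 7.5),
        ("2024-01-02", "north", 2.0),
    ],
)

# Hypothetical warehouse table: restructured for analysis and only ever appended to.
dw = sqlite3.connect(":memory:")
dw.execute(
    "CREATE TABLE daily_sales (day TEXT, region TEXT, total REAL, PRIMARY KEY (day, region))"
)


def load_day(day):
    """Extract one day's orders, aggregate them, and append them to the warehouse."""
    rows = oltp.execute(
        "SELECT day, region, SUM(amount) FROM orders WHERE day = ? GROUP BY day, region",
        (day,),
    ).fetchall()
    dw.executemany("INSERT INTO daily_sales VALUES (?, ?, ?)", rows)
    dw.commit()
    return len(rows)


# Each scheduled run adds a new batch; earlier batches are never modified.
load_day("2024-01-01")
load_day("2024-01-02")

history = dw.execute(
    "SELECT day, region, total FROM daily_sales ORDER BY day, region"
).fetchall()
print(history)
```

In a real system the source and the warehouse would be separate servers and the load would run on a schedule, but the shape is the same: extract, transform, then append.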
As the name suggests, a data warehouse is a "warehouse" holding a large collection of data. Because application departments differ in scale and in their actual needs, data warehouses can be divided into the following types:
△ Data warehouse (the standard type of data warehouse application, built for the entire enterprise)
△ Data mart (relatively small in scale and suited to a department within the enterprise; its design should still be approached from the perspective of building a data warehouse)
△ Multi-level data warehouse (a combination of a data warehouse and data marts, in which the data marts in the lower layer obtain their data from the data warehouse in the upper layer; this ensures data consistency and reduces the load on the data warehouse)
△ Combined data warehouse (suited to a plan in which data marts will later be integrated into a data warehouse)
As for structure, we know that relational databases are modeled with the E-R (entity-relationship) model, while the data warehouse uses the dimensional model. Within the dimensional model, the most common data warehouse structures are the star structure and its extension, the snowflake structure.
(A dimension is a category by which an organization's data warehouse classifies its data, such as time, address, and owner.)
The structure is as follows:

[Figure: star structure]

A star structure is a relational database structure in which a fact table sits in the middle of the model, surrounded by dimension tables. The measured data is kept in the fact table, and the dimension data is kept in the dimension tables. Each dimension table is directly associated with the fact table through a key.
The star structure is very well suited to the design of data warehouse databases, for the following reasons:
1. The design is flexible: it can be modified and extended quickly as the data warehouse grows or the application changes.
2. It is easy for developers and end users to understand.
3. It mirrors the way end users typically think about and use their business data.
4. It is easy to implement as a physical database, and many DBMSs can recognize and optimize this structure, so the design can produce efficient queries.
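To make the star structure concrete, here is a small sketch using Python's standard `sqlite3` module. The tables (`fact_sales`, `dim_time`, `dim_store`, `dim_product`) and their contents are invented for illustration only. The fact table holds one key per dimension plus a measure, and every query joins the fact table directly to the dimensions it needs.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_time    (time_id    INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
-- The fact table sits in the middle: one key per dimension plus the measure.
CREATE TABLE fact_sales (
    time_id    INTEGER REFERENCES dim_time(time_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     REAL
);
INSERT INTO dim_time    VALUES (1, 2024, 1), (2, 2024, 2);
INSERT INTO dim_store   VALUES (1, 'Hangzhou'), (2, 'Beijing');
INSERT INTO dim_product VALUES (1, 'tea'), (2, 'coffee');
INSERT INTO fact_sales  VALUES (1, 1, 1, 100.0), (1, 2, 1, 40.0),
                               (2, 1, 2, 60.0),  (2, 1, 1, 25.0);
""")

# Group the measure by an attribute of one dimension.
sales_by_city = con.execute("""
    SELECT s.city, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_store AS s ON s.store_id = f.store_id
    GROUP BY s.city
    ORDER BY s.city
""").fetchall()

# Filter the measure by attributes of two dimensions at once.
tea_in_january = con.execute("""
    SELECT SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_time    AS t ON t.time_id    = f.time_id
    JOIN dim_product AS p ON p.product_id = f.product_id
    WHERE t.year = 2024 AND t.month = 1 AND p.name = 'tea'
""").fetchone()[0]

print(sales_by_city)
print(tea_in_january)
```

Adding a new dimension in this sketch means adding one dimension table and one key column to the fact table, which is the kind of flexibility point 1 refers to.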



[Figure: snowflake structure]

We can see from the figure that the snowflake structure is an extension of the star structure with auxiliary detail added. Its dimension tables store normalized data: a dimension table is decomposed into a primary dimension table, which is directly associated with the fact table, and a secondary dimension table, which is associated with the primary dimension table. Because the normalized dimension tables are smaller, this structure can reduce the number of disk reads, although queries need additional joins to reach the secondary dimension tables.
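The decomposition into primary and secondary dimension tables can be sketched in the same way. This is again only an illustration with Python's standard `sqlite3` module, and the names `dim_product`, `dim_category`, and `fact_sales` are assumptions for the example. Here the product dimension is normalized so that category names live in their own table.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Secondary dimension table: reached only through the primary dimension table.
CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, category TEXT);
-- Primary dimension table: directly associated with the fact table.
CREATE TABLE dim_product (
    product_id  INTEGER PRIMARY KEY,
    name        TEXT,
    category_id INTEGER REFERENCES dim_category(category_id)
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     REAL
);
INSERT INTO dim_category VALUES (1, 'drinks'), (2, 'snacks');
INSERT INTO dim_product  VALUES (1, 'tea', 1), (2, 'coffee', 1), (3, 'biscuits', 2);
INSERT INTO fact_sales   VALUES (1, 100.0), (2, 60.0), (3, 15.0), (1, 25.0);
""")

# Reaching the category requires two joins: fact -> primary -> secondary.
sales_by_category = con.execute("""
    SELECT c.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product  AS p ON p.product_id  = f.product_id
    JOIN dim_category AS c ON c.category_id = p.category_id
    GROUP BY c.category
    ORDER BY c.category
""").fetchall()

print(sales_by_category)
```

Compared with a star structure, each category name is stored once instead of once per product, at the cost of the extra join.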
That is a brief introduction to some basic knowledge about the data warehouse. I think designing and managing a data warehouse is very practical work: it requires not only sufficient theoretical knowledge but also continual exploration and reflection in real applications. As a beginner who has only just become familiar with the data warehouse, I have shared what I have learned so far, and I hope readers will not laugh. If you have a better understanding or more experience, I hope you will share it with us. I believe that good communication is our "high-quality calcium tablets". :)
