Perhaps many people understand that the data warehouse is built on the basis of multidimensional data model for OLAP data platform, through the previous article-the basic architecture of the data Warehouse, we have seen that the Data Warehouse application may be far more than these. However, we have to admit that multidimensional data model is a major feature of data warehouse and an important aspect of data Warehouse application and implementation, which makes it more suitable for analytic data query and retrieval by optimizing data organization and storage.
Definition and role of multidimensional data model
The multidimensional data model is a database model based on fact and dimension to meet the needs of users to query and analyze data from multi-angle and multi-level, and its basic application is to realize OLAP (Online Analytical Processing).
Of course, through the multidimensional data model of data presentation, query and access is its role in the presentation, but its real role in the realization is that through the data warehouse can be based on different data requirements to establish a variety of multidimensional models, and the formation of a data mart open to different user groups to use, That is, a variety of data items tailored to your needs are placed in a data mart for different data consumers to purchase.
Multidimensional Data Model instances
Before looking at an instance, there are two concepts that need to be understood first: the fact table and the dimension table . Fact tables are used to record specific events, contain specific elements of each event, and what happens; a dimension table is a descriptive information about the elements of an event in the fact table. For example, an event will contain time, place, person, event, fact table records the entire event information, but on the time, location and people and other features only recorded some key markers, such as the protagonist of the event called "Michael", then Michael in the end "what", you need to go to the corresponding dimension table to query " Specific description of Michael ". Based on the fact table and the dimension table, many multidimensional models can be built, including star model, snowflake model and constellation model. It's not going to unfold anymore, explaining the concept is really cumbersome, and based on my understanding of the description is not necessarily everyone can understand, or directly on the example:
This is an example of the simplest star model. The fact table contains two main information: peacekeeping metrics , the specific descriptive information of the dimension is recorded in the dimension table, the dimension attribute in the fact table is only a key associated to the dimension table, does not record the specific information, the measure will generally record the corresponding value of the event, such as the number of products sold here, sales and so on. The information in the dimension table is generally can be layered, such as the time dimension of the month and day, the geographical dimension of the provinces and cities, and so on, this kind of hierarchical information is to meet the fact that the measurement in the table can be done in different granularity of aggregation, such as 2010 sales of goods, from Shanghai sales and so on.
It is also important to note that the update frequency of the dimension table is not high or relatively stable, such as the time dimension of an established decade is not required to be updated in the short term, and the geographical dimension is also; but the data in the fact table is constantly being updated or increased, as events continue to occur, and users are constantly buying goods, Accept the service.
Advantages and disadvantages of multidimensional data model
The multidimensional model mentioned here refers to the multidimensional data model based on relational database, which has its own advantages and disadvantages compared with the traditional relational model.
Advantages:
The most important advantage of the multidimensional data model is its analysis-optimized data organization and storage mode. as a simple example, the operations database of an e-commerce site may be recorded at a certain point in time, a user buys a product, and sends a collection of such records to a specific address, so we cannot immediately obtain 2010 years of July exactly how many users purchased the goods, Or 2010 years in July How many people in Zhejiang Province purchased the goods? But on the basis of multidimensional model, this kind of query becomes simple, as long as the data in the time dimension is aggregated into the 2010 July, at the same time in the geographical dimension on the data aggregation to the granularity of Zhejiang province can be achieved, this is the concept of OLAP, then there will be related articles to introduce.
Disadvantages:
The disadvantage of multidimensional models is that they are less flexible than relational models and can be difficult to change once the model is built. such as the fact of an order, where the user may purchase a variety of goods, including the time, the number of user peacekeeping products, the total price and other measures, for the relational model if we need to distinguish between the order of the products contained in, we only need to build another table to record the order number and the correspondence of the product, But in a multidimensional model, once the fact table is built, we cannot split an order record in the fact table, so we cannot build a new dimension-product dimension, but also build a product-themed fact table.
Therefore, before building a multidimensional model, we will generally be based on the requirements of the detailed design of the model, which should be included in the measurement of the size of the data should be kept to what granularity to meet the needs of users.
The multidimensional model of Data Warehouse is introduced briefly, do you think that when you analyze the data, a lot of data is the structure of multi-dimensional model, or you have used your own method to build a multidimensional model or the realization of the multidimensional display of data, welcome to share with me.
Multidimensional data model of Data Warehouse