In the business intelligence solution of multidimensional analysis, common models can be divided into star and snow pattern models based on the relationship between fact tables and dimension tables. When designing a model for logical data, consider whether the data is organized according to the star or snowflake model.
When all dimension tables are directly connected to the "fact table", the entire graph is like a star, so this model is called a star model, 2.
The star schema is an informal structure. Each dimension of a multi-dimensional dataset is directly connected to a fact table and there is no gradient dimension. Therefore, data is redundant, such as in a region dimension table, country
Two records, City C of province a and City D of province A, are stored twice for country A and province B respectively, that is, redundancy exists.
Figure 2. Star Model in sales data warehouse
When one or more dimension tables are not directly connected to the fact table, but are connected to the fact table through other dimension tables, the diagram is like a snowflake model. The snowflake model is a star model.
Type Extension. The dimension table of the star model is further layered, and the original dimension tables may be expanded to small fact tables to form some local "layers"
Partition. All the decomposed tables are connected to the primary dimension table instead of the fact table. From 2 to 3, the region dimension table is decomposed into the dimension table of country, province, and city. It has the following advantages:
Query performance can be improved by minimizing data storage and combining small dimension tables. Snowflake structure remove data redundancy
Figure 3. snowflake model in sales data warehouse
Because of data redundancy, many statistical queries do not require external connections. Therefore, the star model is generally more efficient than the snowflake model. Design and Implementation of a star structure without considering many normalization factors
Are relatively simple.
Due to the de-redundancy of the snowflake model, some statistics need to be generated through table join, so the efficiency is not necessarily high in the star model. Normalization is also a complicated process, with the corresponding database structure design and data
Data ETL and later maintenance are complicated. Therefore, when redundancy is acceptable, the star model is used more efficiently.