Dimension design of a property of small knowledge of data warehouse design

Source: Internet
Author: User

We often encounter this problem in the design of the Data Warehouse: If the dimension has only one attribute in the dimension design, is the choice to create a single dimension for this attribute, or will the attribute of that dimension be placed directly in the fact table as part of the fact table?

Suppose there is a dimension, which is usually designed with at least two columns (Dimkey and Dimattribute properties), and the fact table is associated to this dimension through Dimkey. First of all, in the query phase of a multi-table JOIN relationship comparison of single-table query is certainly less efficient, we look at the following example:

CREATEdim_table (Dim_keyINT PRIMARY KEY IDENTITY(1,1), Dim_attrNVARCHAR( -))CREATEfact_table (Dim_keyINT FOREIGN KEY REFERENCESdim_table (Dim_key), MEASUREDECIMAL( -,2))

The query for a typical star structure is as follows:

SELECT d.dim_attr,        SUM  as  Total  from  as F INNER JOIN  as D  on = D.dim_key GROUP  by D.dim_attr

If you put this attribute directly in the FACT table, the results and queries are as follows:

 create  table   Fact_table_2 (dim_attr  int  foreign  key  references   Dim_table (Dim_key), MEASURE  decimal  (18 , 2   Span style= "color: #0000ff;" >select  sum  (MEASURE) as   Total  from   fact_table_2  group  by  dim_attr 

Our queries and aggregations are much simpler and more efficient in terms of query efficiency. But why do we usually choose to put this individual attribute or the dimension table, here are a few reasons we need to consider:

1. If the fact table is very large, using the Dim_key INT type 4 Bytes with respect to Dim_attr's NVARCHAR (20) type can significantly reduce the volume of the fact table.

2. If this attribute value changes in the source business system, it means that we want to update all the property values associated with the property in the fact table.

3. It is possible today that the dimension does have only one attribute, but who can ensure that the dimension does not add any other related attributes?

Data Warehouse design is an iterative development process, the development of a year, maintenance for several years, if we can consider the above reasons, we can clearly consider whether in the design phase of the need to select a single attribute as a dimension to design.

For more bi articles, see the Bi-series essay list (SSIS, SSRS, SSAS, MDX, SQL Server) If you feel this article has helped you, please help recommend it to make it easy for others to see these articles quickly in the Biwork blog recommendations Bar.

Dimension design of a property of small knowledge of data warehouse design

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.