Data warehouse design without primary keys and foreign keys

Source: Internet
Author: User

Before deploying the company's example BI project, I noticed that the database tables had no primary keys or foreign keys defined. I had assumed this was because it was only a mock project with loose requirements. Today I learned that a data warehouse is often deliberately designed without database-level primary and foreign key constraints. Instead, these constraints are enforced in the ETL code, so that all source data satisfying the constraints can flow into the data warehouse.

Http://stackoverflow.com/questions/21288549/why-primary-key-is-not-required-on-fact-table-in-dimensional-modelling


The primary key is there, but enforcing the PRIMARY KEY constraint at the database level is not required. Technically, a unique key or primary key is a key that uniquely identifies each row, and it can be composed of more than one attribute of the entity. In a fact table, the foreign keys flowing in from the dimension tables together already act as a compound primary key. These foreign-key combinations can uniquely identify each record in the fact table, so this combination of foreign keys is the primary key of the fact table.

Why not a surrogate key then?

If you wanted, you could define a surrogate key for the fact table. But what purpose would that serve? You are never going to retrieve a single record from the fact table by its surrogate key (use indexes instead), nor are you going to use the surrogate key to join the fact table with other tables. Such a surrogate key would be a complete waste of space in the database.

Enforcing database constraints

When you define this conceptual primary key at the database level, the database has to ensure that the constraint is not violated by any DML operation performed on the table. Enforcing the constraint is an overhead for the database. It might be insignificant for an OLTP system, but for a large OLAP system where data is loaded in batches, it may incur significant performance penalties. Besides, why would you want the database to ensure the integrity of the constraints when you can ensure the same during the data loading phase itself (typically through your ETL code)?


An enforced primary key is not required, because the combination of dimension keys in each record already uniquely identifies that record.
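If the compound key is not enforced by the database, the ETL job has to check it before loading. The following is a minimal sketch of such a check, not a definitive implementation. The column names (`date_key`, `product_key`, `store_key`) and the sample batch are hypothetical and do not come from the original project.

```python
from collections import Counter

# Hypothetical dimension-key columns that together identify a fact row.
KEY_COLUMNS = ("date_key", "product_key", "store_key")


def find_duplicate_keys(rows, key_columns=KEY_COLUMNS):
    """Return the compound keys that occur more than once in the batch."""
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# Illustrative batch: the first and third rows share the same compound key.
batch = [
    {"date_key": 20240101, "product_key": 1, "store_key": 10, "amount": 5.0},
    {"date_key": 20240101, "product_key": 2, "store_key": 10, "amount": 7.5},
    {"date_key": 20240101, "product_key": 1, "store_key": 10, "amount": 9.0},
]

duplicates = find_duplicate_keys(batch)
```

An ETL step could reject or merge the rows behind `duplicates` before the batch is written to the fact table.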

A surrogate key is not required, because data is retrieved through indexes and the surrogate key would never be used to join the fact table with other tables.
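As a rough illustration of retrieving facts through an index instead of a surrogate key, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The table name `fact_sales`, its columns, and the index name are assumptions made for this example; a real warehouse engine would have its own indexing options.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Fact table declared without a PRIMARY KEY, surrogate key, or FOREIGN KEY.
conn.execute(
    "CREATE TABLE fact_sales ("
    " date_key INTEGER, product_key INTEGER, store_key INTEGER, amount REAL)"
)

# A plain (non-unique) index on the dimension keys supports lookups.
conn.execute(
    "CREATE INDEX idx_fact_sales_dims"
    " ON fact_sales (date_key, product_key, store_key)"
)

conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [(20240101, 1, 10, 5.0), (20240101, 2, 10, 7.5), (20240102, 1, 11, 9.0)],
)


def sales_for(date_key, product_key):
    """Look up facts by dimension keys; the planner can use the index."""
    cur = conn.execute(
        "SELECT store_key, amount FROM fact_sales"
        " WHERE date_key = ? AND product_key = ?",
        (date_key, product_key),
    )
    return cur.fetchall()


rows = sales_for(20240101, 1)
```

The query filters on the leading columns of the index, so no per-row identifier is needed to locate the matching facts.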

Database-level constraints are redundant, because the priority is that all cleansed data can flow into the warehouse. Ensuring that the data is complete and consistent is the job of the ETL code.
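One way the ETL code might take over the role of foreign key constraints is to validate each fact row against the known dimension keys before loading. This is only a sketch under stated assumptions: the function `split_by_foreign_keys`, the column names, and the sample data are hypothetical.

```python
def split_by_foreign_keys(rows, dimension_keys):
    """Split rows into (clean, rejected) by checking each foreign key.

    dimension_keys maps a fact column name to the set of keys that
    exist in the corresponding dimension table.
    """
    clean, rejected = [], []
    for row in rows:
        ok = all(row.get(col) in keys for col, keys in dimension_keys.items())
        (clean if ok else rejected).append(row)
    return clean, rejected


# Hypothetical keys already present in the dimension tables.
dimension_keys = {
    "product_key": {1, 2},
    "store_key": {10, 11},
}

batch = [
    {"product_key": 1, "store_key": 10, "amount": 5.0},
    {"product_key": 3, "store_key": 10, "amount": 7.5},    # unknown product
    {"product_key": 2, "store_key": None, "amount": 1.0},  # missing store
]

clean, rejected = split_by_foreign_keys(batch, dimension_keys)
```

Only the rows in `clean` would be loaded into the fact table; the rows in `rejected` can be logged or routed to an error table for review.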
