Data Warehousing Special Topic (3)-thinking on the design of the fact table of distributed data Warehouse

Source: Internet
Author: User

First, preface

Recently in the design of data warehouse data logical model, taking into account the large-volume data storage in the Distributed Data Warehouse technology architecture model, the traditional face-relational data warehouse data storage model needs to be reformed. A set of data storage models that are truly suitable for distributed data warehouses are designed.

Second, the basis of the fact table design

The fact table records the operational events that occur in the real world, and the resulting degrees of value . The fact table is designed to be completely dependent on physical activity and is not affected by the resulting report. In fact tables, in addition to numeric measures, fact tables always contain foreign keys , which are used to correlate dimensions associated with them, or to include degenerate dimension keys and date/time stamps .

Iii. Traditional Models

In the case of the FS-LDM data storage Model event topic domain data storage model design, its event topic domain data logical model structure is as follows:

The main problems of the traditional model are as follows: if there is a small amount of data, it is not a problem to perform multi-table associations, but when a single table stores massive amounts of data in a distributed data warehouse, it is obvious that the pattern will face challenges.

Iii. a new principle of distributed model-dimension modeling

  (1) By the value of the key: The key value of the unique dimension table, unless necessary, otherwise not introduce a dimension table, such as the IP Address dimension table, using IP as the primary key of the dimension table, the fact table stores the IP value;

(2) Reasonable sub-table: Traditional relational data Warehouse has multiple table integration impulse, such as event fact table, various acount ind,finance Ind, etc., used to extend the universality of the table, and try to store all the data in a single table. Distributed Data Warehouse design, on the contrary, because of the single-table data size problem, if you want to meet the performance of analysis and processing, reasonable data in accordance with the business of the table storage. such as financial related events, account-related events, separate into a table. More conducive to data calculation and analysis.

Iv. examples of distributed dimension models

serial number field English field Chinese field explanation
1 event_id Event ID Record identification   Identity
2 Tm Time Time stamp Dimension of
3 Domain Domain   Dimension-Value
4 Ip IP Address IP Address     Dimension-Value
5 Os Operating system Operating system     Dimension-Value
6 user_id User ID User ID     Dimension-FK
7 Date Date Date (new)     Dimension-Value
8 From Source Accessing the source system     Dimension-Value

 V. To be continued before the end

Distributed Data Warehouse data storage model design, follow-up will continue to update, please pay attention to QQ Group: Distributed Data Warehouse modeling 398419457.

Data Warehousing Special Topic (3)-thinking on the design of the fact table of distributed data Warehouse

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.