Data Warehouse (6): Conceptual Design

Source: Internet
Author: User
Article Directory
    • 1.1.1 Define facts
    • 1.1.2 Build an attribute tree
    • 1.1.3 Prune and graft the attribute tree
    • 1.1.4 Define dimensions
    • 1.1.5 Define measures
    • 1.1.6 Generate the fact schema

Three basic approaches can be used in data mart design: data-driven, requirement-driven, and hybrid. They differ in the relative weight given to the source database analysis stage and the end-user requirement analysis stage. The choice of approach strongly affects how conceptual design is carried out.

Data-driven approaches include design based on an entity-relationship schema, design based on a relational schema, and design based on an XML schema. The conceptual entity-relationship model is more expressive than the relational logical model, so the former is generally considered the better design source. In practice, however, a company often cannot supply a precise and complete entity-relationship schema (the documentation is lost, incomplete, or otherwise unavailable), and the design must then start from the logical schema of the database. In addition, much Web data is in XML format; design based on an XML schema derives a data mart conceptual schema from the XML source schema.

1. Data-driven method design

1.1 Design based on the entity-relationship model

Starting from an entity-relationship schema, the design of a data mart conceptual schema that complies with the Dimensional Fact Model (DFM) involves the following steps:

(1) Define facts.

(2) For each fact:

A. Build an attribute tree.

B. Prune and graft the attribute tree.

C. Define dimensions.

D. Define measures.

E. Create a fact schema.

First, the relevant facts are selected from the source schema. Then, for each fact, an attribute tree is built semi-automatically. The attribute tree is a transitional structure used to delimit the boundary of the fact schema: irrelevant attributes are removed and the dependencies attached to them are modified (step (2) B). The attribute tree also links the data mart to the source schema, and this link is key to the data preparation process. Converting the attribute tree into a fact schema (step (2) E) is then relatively easy. Step A is algorithm-based; steps C, D, and E are attribute-based. Steps (1) and B require a deep understanding of the company's business model.

1.1.1 define facts

Facts usually correspond to dynamic events in the company. In an entity-relationship schema, a fact may correspond either to an entity F or to an n-ary relationship R among entities E1, E2, ..., En. In the latter case, R can be converted into an entity (a process called materialization). To do so, add a new entity F and replace each branch of R with a binary relationship Ri between F and Ei. Let min(E, A) and max(E, A) denote the minimum and maximum cardinality with which entity E participates in relationship A (in general, min(E, A) ∈ {0, 1} and max(E, A) ∈ {1, N}). Then:

min(F, Ri) = max(F, Ri) = 1, min(Ei, Ri) = min(Ei, R), max(Ei, Ri) = max(Ei, R).
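The materialization rule above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the `Branch` class, the `materialize` function, and the SALE/PRODUCT/STORE/DATE names are hypothetical and do not come from the source, and cardinalities are kept as the strings "0", "1", and "N".

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Branch:
    """One branch of a relationship: an entity and its (min, max) cardinality."""
    entity: str
    min_card: str  # "0" or "1"
    max_card: str  # "1" or "N"


def materialize(relationship: str, branches: List[Branch],
                fact: str = "F") -> Dict[str, Tuple[Branch, Branch]]:
    """Replace an n-ary relationship R with a fact entity F and n binary relationships.

    Returns a dict mapping each binary relationship name (R1, R2, ...) to a
    pair of branches: (fact side, entity side).
    """
    binaries = {}
    for i, branch in enumerate(branches, start=1):
        # min(F, Ri) = max(F, Ri) = 1
        fact_side = Branch(fact, "1", "1")
        # min(Ei, Ri) = min(Ei, R) and max(Ei, Ri) = max(Ei, R)
        entity_side = Branch(branch.entity, branch.min_card, branch.max_card)
        binaries[f"{relationship}{i}"] = (fact_side, entity_side)
    return binaries


# Hypothetical ternary relationship R among PRODUCT, STORE and DATE.
r_branches = [
    Branch("PRODUCT", "0", "N"),
    Branch("STORE", "1", "N"),
    Branch("DATE", "0", "N"),
]
binary_relationships = materialize("R", r_branches, fact="SALE")
for name, (fact_side, entity_side) in binary_relationships.items():
    print(name, fact_side, entity_side)
```

Each original entity keeps the cardinalities it had in R, while the new fact entity always participates exactly once in every binary relationship.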

Note: sometimes several entities are candidates for expressing the same fact. In that case, it is advisable to choose as the fact the entity from which the attribute tree containing the most attributes can be built.
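As a rough illustration of this selection heuristic, the sketch below compares candidate entities by the number of attributes reachable from each one. It assumes, for simplicity, that the size of the attribute tree can be approximated by following to-one relationships only; the `ATTRS` and `TO_ONE` tables and all entity names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical schema: number of attributes per entity (identifier included)
# and the entities each one reaches through to-one relationships.
ATTRS: Dict[str, int] = {"ORDER": 3, "ORDER_LINE": 4, "PRODUCT": 5, "CUSTOMER": 6}
TO_ONE: Dict[str, List[str]] = {
    "ORDER": ["CUSTOMER"],
    "ORDER_LINE": ["ORDER", "PRODUCT"],
    "PRODUCT": [],
    "CUSTOMER": [],
}


def tree_size(entity: str, path: Tuple[str, ...] = ()) -> int:
    """Count the attributes reachable from `entity` by following to-one links."""
    path = path + (entity,)
    return ATTRS[entity] + sum(
        tree_size(target, path)
        for target in TO_ONE[entity]
        if target not in path  # do not loop on cyclic schemas
    )


def best_fact(candidates: List[str]) -> str:
    """Pick the candidate whose (approximate) attribute tree is largest."""
    return max(candidates, key=tree_size)


print(best_fact(["ORDER", "ORDER_LINE"]))
```

In this toy schema ORDER_LINE is preferred, because ORDER, CUSTOMER, and PRODUCT are all reachable from it, whereas ORDER reaches only CUSTOMER.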

1.1.2 build an attribute tree

Attribute tree

Given a relevant portion of the entity-relationship source schema and an entity F classified as a fact, the attribute tree meets the following requirements:

    • Each node corresponds to an attribute (simple or composite) of the source schema.
    • The root corresponds to the identifier of F.
    • For each node v, the attribute corresponding to v functionally determines all the attributes corresponding to the descendants of v.
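The three requirements above can be illustrated with a short sketch. It is a simplified construction under stated assumptions, not necessarily the algorithm of the reference: the source schema is given as a hypothetical dictionary (`SCHEMA`) in which every entity lists its identifier, its plain attributes, and the entities it reaches through to-one relationships, so that each identifier functionally determines everything placed below it.

```python
from typing import Dict, FrozenSet, List, Optional

# Hypothetical toy source schema (all names are illustrative only).
SCHEMA: Dict[str, dict] = {
    "SALE": {"id": "saleId", "attrs": ["quantity", "unitPrice"],
             "to_one": ["PRODUCT", "STORE"]},
    "PRODUCT": {"id": "productId", "attrs": ["productName"],
                "to_one": ["CATEGORY"]},
    "STORE": {"id": "storeId", "attrs": ["storeName", "city"], "to_one": []},
    "CATEGORY": {"id": "categoryId", "attrs": ["categoryName"], "to_one": []},
}


def build_tree(schema: Dict[str, dict], entity: str,
               visited: Optional[FrozenSet[str]] = None) -> dict:
    """Return the attribute tree rooted at the identifier of `entity`."""
    visited = (visited or frozenset()) | {entity}
    spec = schema[entity]
    # Plain attributes are determined by the identifier, so they become leaves.
    children = [{"name": attr, "children": []} for attr in spec["attrs"]]
    # A to-one relationship means the identifier also determines the
    # identifier of the target entity, which roots a subtree.
    for target in spec["to_one"]:
        if target not in visited:  # avoid looping on cyclic schemas
            children.append(build_tree(schema, target, visited))
    return {"name": spec["id"], "children": children}


def render(node: dict, depth: int = 0) -> List[str]:
    """Flatten the tree into indented lines, one per node."""
    lines = ["  " * depth + node["name"]]
    for child in node["children"]:
        lines.extend(render(child, depth + 1))
    return lines


tree = build_tree(SCHEMA, "SALE")
print("\n".join(render(tree)))
```

The root is the identifier of the fact entity (here `saleId`), and every node sits below an attribute that functionally determines it, which is the property that pruning and grafting later rely on.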

 

1.1.3 Prune and graft the attribute tree

1.1.4 Define dimensions

1.1.5 Define measures

1.1.6 Generate the fact schema

1.2 Design based on relational schemata

1.3 Design based on XML schemata

2. Hybrid method design

3. Requirement-driven method design

References:

Matteo Golfarelli, Stefano Rizzi. Data Warehouse Design: Modern Principles and Methodologies.

 
