Data Mining Learning Notes: Multidimensional Data Model and Data Cube


The multidimensional data model is a database model built on facts and dimensions. It is designed to let users query and analyze data from multiple perspectives and at multiple levels of detail, and its main application is OLAP (Online Analytical Processing).

Each dimension corresponds to one attribute or a group of attributes in the schema, and each cell stores the value of an aggregate measure, such as count or sum. A data cube provides a multidimensional view of the data and allows summary data to be precomputed and accessed quickly.

Data Mining: Concepts and Techniques

A data cube allows data to be modeled and viewed in multiple dimensions. It is defined by dimensions and facts.
Dimensions are the perspectives or entities with respect to which an organization wants to keep records. Each dimension has a table associated with it, called a dimension table.
The fact table contains the names of the facts, or measures, together with the keys of each related dimension table.

In the data warehousing research literature, an n-dimensional data cube is called a cuboid. Given a set of dimensions, we can construct a lattice of cuboids, each of which shows the data at a different level of aggregation, that is, grouped by a different subset of the dimensions. The lattice of cuboids is called a data cube. The 0-D cuboid, which holds the highest level of summarization, is called the apex cuboid. The cuboid that holds the lowest level of summarization is called the base cuboid.
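As a rough illustration of the lattice, the sketch below builds one cuboid for every subset of three dimensions, using only the Python standard library. The sales rows, the dimension names, and the helper functions are all made up for this example, and concept hierarchies are ignored, so n dimensions give 2^n cuboids.

```python
from itertools import combinations
from collections import defaultdict

# Hypothetical base-level rows: (time, item, location, units_sold)
SALES = [
    ("Q1", "phone", "Chicago", 10),
    ("Q1", "phone", "Toronto", 5),
    ("Q1", "laptop", "Chicago", 3),
    ("Q2", "phone", "Chicago", 7),
    ("Q2", "laptop", "Toronto", 4),
]
DIMENSIONS = ("time", "item", "location")


def cuboid(rows, dims):
    """Sum units_sold grouped by the given subset of dimensions."""
    index = [DIMENSIONS.index(d) for d in dims]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in index)
        totals[key] += row[-1]
    return dict(totals)


def cuboid_lattice(rows):
    """Build one cuboid per subset of DIMENSIONS."""
    lattice = {}
    for k in range(len(DIMENSIONS) + 1):
        for dims in combinations(DIMENSIONS, k):
            lattice[dims] = cuboid(rows, dims)
    return lattice


lattice = cuboid_lattice(SALES)
apex = lattice[()]            # 0-D cuboid: a single grand total
base = lattice[DIMENSIONS]    # 3-D cuboid: the most detailed level
print(len(lattice))           # number of cuboids in the lattice
print(apex)
print(lattice[("item",)])     # 1-D cuboid grouped by item only
```

The empty tuple of dimensions plays the role of the apex cuboid, and the full tuple plays the role of the base cuboid.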

 

Conceptual Models of the Data Warehouse
The most popular conceptual model for data warehouses is the multidimensional data model. It can take the form of a star schema, a snowflake schema, or a fact constellation schema.


1. Star schema: a fact table sits in the center, surrounded by a set of dimension tables (one for each dimension). The fact table holds the bulk of the data and contains no redundancy.


2. Snowflake schema: a variant of the star schema. Some dimension tables are normalized, so their data is split further into additional tables. The resulting schema graph has a shape similar to a snowflake.

Compared with the star schema, the snowflake schema normalizes the dimension tables.
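The difference between the two schemas can be sketched with Python's built-in sqlite3 module. All table names, column names, and rows below are hypothetical. The item dimension is kept as a single denormalized table, as in a star schema, while the location dimension is normalized into two tables, as in a snowflake schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Snowflake-style: city details are split out of the location dimension.
    CREATE TABLE dim_city (city_key INTEGER PRIMARY KEY, city TEXT, country TEXT);
    CREATE TABLE dim_location (location_key INTEGER PRIMARY KEY, street TEXT,
                               city_key INTEGER REFERENCES dim_city(city_key));
    -- Star-style: one denormalized table for the item dimension.
    CREATE TABLE dim_item (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT);
    -- Fact table: the keys of the dimension tables plus the measure.
    CREATE TABLE fact_sales (item_key INTEGER REFERENCES dim_item(item_key),
                             location_key INTEGER REFERENCES dim_location(location_key),
                             units_sold INTEGER);

    INSERT INTO dim_city VALUES (1, 'Chicago', 'USA'), (2, 'Toronto', 'Canada');
    INSERT INTO dim_location VALUES (10, 'Main St', 1), (11, 'Lake Ave', 1), (12, 'King St', 2);
    INSERT INTO dim_item VALUES (100, 'phone', 'Acme'), (101, 'laptop', 'Acme');
    INSERT INTO fact_sales VALUES (100, 10, 10), (100, 12, 5), (101, 11, 3), (101, 12, 4);
""")

# The normalized location dimension costs one extra join to reach the country.
rows = conn.execute("""
    SELECT c.country, i.item_name, SUM(f.units_sold)
    FROM fact_sales f
    JOIN dim_item i ON i.item_key = f.item_key
    JOIN dim_location l ON l.location_key = f.location_key
    JOIN dim_city c ON c.city_key = l.city_key
    GROUP BY c.country, i.item_name
    ORDER BY c.country, i.item_name
""").fetchall()
print(rows)
```

Normalizing a dimension table reduces redundancy inside that table, but queries then need more joins to reach the same attributes.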

 


3. Fact constellation: multiple fact tables share dimension tables. This schema can be viewed as a collection of star schemas, so it is called a galaxy schema or fact constellation.

In a fact constellation schema, the dimension tables are shared among the fact tables.

Concept hierarchies make it easier to aggregate data.

 

OLAP operations on Multidimensional Data Models
Roll-up: summarizes data.
It aggregates data either by climbing up the concept hierarchy of a dimension or by dimension reduction.


Drill-down: the inverse of roll-up.
It moves from less detailed data to more detailed data, either by stepping down the concept hierarchy of a dimension or by introducing additional dimensions.

[Figure 1: Roll-up and drill-down]
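A minimal sketch of roll-up and drill-down on a tiny two-dimensional cube, assuming a made-up city-to-country concept hierarchy. The data and function names are illustrative only. Note that drill-down needs the detailed cells to be kept somewhere, because they cannot be recovered from the summary alone.

```python
from collections import defaultdict

# Hypothetical concept hierarchy for the location dimension: city -> country.
CITY_TO_COUNTRY = {"Chicago": "USA", "New York": "USA", "Toronto": "Canada"}

# Hypothetical city-level cells: (city, quarter) -> units sold.
CITY_CUBE = {
    ("Chicago", "Q1"): 10,
    ("New York", "Q1"): 8,
    ("Toronto", "Q1"): 5,
    ("Chicago", "Q2"): 7,
    ("Toronto", "Q2"): 4,
}


def roll_up_location(cube, hierarchy):
    """Climb the hierarchy: replace each city by its country and re-aggregate."""
    result = defaultdict(int)
    for (city, quarter), units in cube.items():
        result[(hierarchy[city], quarter)] += units
    return dict(result)


def roll_up_drop_dimension(cube, position):
    """Dimension reduction: remove one dimension from every key and re-aggregate."""
    result = defaultdict(int)
    for key, units in cube.items():
        reduced = key[:position] + key[position + 1:]
        result[reduced] += units
    return dict(result)


def drill_down(cube, hierarchy, country, quarter):
    """Return the city-level cells behind one country-level cell."""
    return {key: units for key, units in cube.items()
            if hierarchy[key[0]] == country and key[1] == quarter}


country_cube = roll_up_location(CITY_CUBE, CITY_TO_COUNTRY)
by_city = roll_up_drop_dimension(CITY_CUBE, position=1)
usa_q1_detail = drill_down(CITY_CUBE, CITY_TO_COUNTRY, "USA", "Q1")
print(country_cube)
print(by_city)
print(usa_q1_detail)
```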


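Slice and dice
Selection and projection operations. A slice selects on one dimension of the cube and yields a subcube; a dice selects on two or more dimensions.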
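The sketch below treats a cube as a dictionary from coordinate tuples to a measure. A slice fixes a single value on one dimension, and a dice keeps the cells whose values fall in the allowed sets on several dimensions. The cube contents and the helper names slice_ and dice are assumptions made for illustration.

```python
# Hypothetical cube cells: (time, item, location) -> units sold.
CUBE = {
    ("Q1", "phone", "Chicago"): 10,
    ("Q1", "phone", "Toronto"): 5,
    ("Q1", "laptop", "Chicago"): 3,
    ("Q2", "phone", "Chicago"): 7,
    ("Q2", "laptop", "Toronto"): 4,
}
DIMS = ("time", "item", "location")


def slice_(cube, dim, value):
    """Select one value on one dimension and drop that dimension from the keys."""
    i = DIMS.index(dim)
    return {key[:i] + key[i + 1:]: v for key, v in cube.items() if key[i] == value}


def dice(cube, **allowed):
    """Keep cells whose value on every named dimension is in the allowed set."""
    checks = [(DIMS.index(name), set(values)) for name, values in allowed.items()]
    return {key: v for key, v in cube.items()
            if all(key[i] in values for i, values in checks)}


q1_slice = slice_(CUBE, "time", "Q1")
sub_cube = dice(CUBE, time=["Q1", "Q2"], item=["phone"], location=["Chicago"])
print(q1_slice)
print(sub_cube)
```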


Pivot (rotate)
Reorients the cube for visualization, or transforms a 3-dimensional cube into a series of 2-dimensional planes.
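Both forms of pivoting can be sketched on the same dictionary representation: splitting a 3-dimensional cube into one 2-dimensional plane per value of its first dimension, and rotating a plane by swapping its two axes. The data and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical cube cells: (time, item, location) -> units sold.
CUBE = {
    ("Q1", "phone", "Chicago"): 10,
    ("Q1", "phone", "Toronto"): 5,
    ("Q1", "laptop", "Chicago"): 3,
    ("Q2", "phone", "Chicago"): 7,
    ("Q2", "laptop", "Toronto"): 4,
}


def to_planes(cube):
    """Turn a 3-D cube into a series of 2-D planes, one per value of the first dimension."""
    planes = defaultdict(dict)
    for (first, second, third), value in cube.items():
        planes[first][(second, third)] = value
    return dict(planes)


def rotate(plane):
    """Swap the two axes of a 2-D plane {(row, col): value}."""
    return {(col, row): value for (row, col), value in plane.items()}


planes = to_planes(CUBE)
q1_rotated = rotate(planes["Q1"])
print(sorted(planes))
print(q1_rotated)
```

Pivoting changes only how the cells are arranged for viewing; the measure values themselves stay the same.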


Other OLAP operations
Drill-across: executes queries involving more than one fact table.
Drill-through: uses relational SQL facilities to drill through the bottom level of the data cube down to its back-end relational tables.
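As a rough sketch of a query that spans more than one fact table, the example below uses sqlite3 with two hypothetical fact tables, fact_sales and fact_shipping, that share one item dimension. Each fact table is first aggregated to the shared dimension and the two summaries are then joined. This is one common way to write such a query, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_item (item_key INTEGER PRIMARY KEY, item_name TEXT);
    CREATE TABLE fact_sales (item_key INTEGER, units_sold INTEGER);
    CREATE TABLE fact_shipping (item_key INTEGER, units_shipped INTEGER);

    INSERT INTO dim_item VALUES (1, 'phone'), (2, 'laptop');
    INSERT INTO fact_sales VALUES (1, 10), (1, 7), (2, 3);
    INSERT INTO fact_shipping VALUES (1, 12), (2, 2), (2, 1);
""")

# Aggregate each fact table separately, then join the two summaries;
# joining the raw fact tables directly would multiply rows and double-count.
rows = conn.execute("""
    SELECT i.item_name, s.sold, h.shipped
    FROM dim_item i
    JOIN (SELECT item_key, SUM(units_sold) AS sold
          FROM fact_sales GROUP BY item_key) s ON s.item_key = i.item_key
    JOIN (SELECT item_key, SUM(units_shipped) AS shipped
          FROM fact_shipping GROUP BY item_key) h ON h.item_key = i.item_key
    ORDER BY i.item_name
""").fetchall()
print(rows)
```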

 

References:

Data Mining: Concepts and Techniques

Wang Can, Data Mining video tutorial

Multidimensional Data Model of Data Warehouse: Http://webdataanalysis.net/web-data-warehouse/multidimensional-data-model/
