OLAP introduction (combined with personal work), olap introduction combined with personal

Source: Internet
Author: User

OLAP introduction (combined with personal work), olap introduction combined with personal
OLTP and OLAP
Traditional database systems are OLTP and can only provide raw data operations. Analysis is not supported.

OLTP system: Performs online transaction and query processing. General supermarket invoicing system, functions: Registration, accounting, inventory and sales records, etc,

OLAP system: data analysis and decision-making service, which organizes data in different formats to meet the needs of different users.

Differences:

Oriented. OLTP targets customers, that is, operators, such as supermarket cashiers and bank counters. OLAP is market-oriented and used for data analysis. Analysts include data analysts, business managers who make decisions, or policy formulation.

Data content: OLTP current data. Summarize and aggregate historical OLAP data.

Database Design: ER Model for OLTP and application database. OLAP uses a star or snowflake model for Topic database design.

There is also an Access Mode: The difference between operational transactions and read-only analysis and computing.

And so on.

Multidimensional Data Model:

Cube:

Each subset of a given dimension generates a cuboid (called a cube ). In this way, the data can be displayed at the aggregation level or group by at different granularities, And the content above the entire body becomes cube.

The cube summarized at the lowest layer is called the basecuboid ). After a summary of a dimension appears, it is not a basic cube.

The data aggregated to the highest level is called the vertex cube (apexcuboid), for example, 0-D cube, that's to say. All dimensions are aggregated with only one cuboid, and no more summary is allowed.

Vertex cubes are the most generalized cubes. The basic cube is the lowest specialized cube.

Width granularity is a summary of different degrees, involving operations:

Roll up (roll up), which is called drill-up drillup by the supplier.

Drill-down (drill down) is layered down along the concept of dimension. You need to find more fine-grained data.

Slice: fixed the value of a dimension and extracted a subset of the dimension.

Slice: multiple values are selected from multiple dimensions to extract the child cube mapped to them.

Rotating rotate is also called the rotate number Axis Transformation. To put it simply, the rows and columns in the rotate table are transposed. It is more complex than three dimensions, and the positions of different numbers are transformed. It is said that the data view angle is changed.

 

Concept hierarchy: low-level concepts (such as cities) are mapped to higher-level concepts (such as countries ). From low to high is called generalization, and from high to low is called specialize ).

Schema hierarchy refers to the full or partial order of attributes in the database schema.

Sets-grouping hierarchy discretization or grouping of attribute values of a given dimension. For example, the age attribute is divided into three subsets: young, mid, and old, and the group by sex is the subset of men and women.

 

 

Implementation of Data cubes:

The data warehouse model is a multi-dimensional model. Currently, the following models are often used:

Star Model: A large, comprehensive, non-redundant fact table (fact), and dimension tables (dimension) in different analysis dimensions ). A dimension table is associated with a fact table based on its dimension key (values in all possible ranges.

Snowflake model: further refinement of the star model, which standardizes dimension tables with multiple values (that is, extracting a value contained in the dimension table as a new dimension table ), to reduce redundancy.

In this way, data is further decomposed into additional tables, which is easy to maintain and saves space (to prevent dimensional disasters). However, more association operations are required during query to reduce timeliness.

Fact constellation or galaxy schema: multiple fact tableshare all dimesioms (shared dimension table ).

For example, I designed data warehouse. Workbench

Cube definition

Dimension Definition

Generally, fact constellation is used for data warehouse.

 

Index

Measure

 

When there are too many dimensions (feature spaces are very complex), there will be a lot of associated computing between dimensions, and the hierarchical dimension concept will aggravate the disaster. In the cube, the computation of different dimensions produces huge data, that is, the pre-calculation of all cubes (subcubes) in the cube, and the storage space seems to increase. In N-dimensional mode, there will be 2n subcubes. In addition to the concept layered Li, the total number of cubes

 

 

Pre-calculation: 1 (no materialization) 2 (full materialization) 3 (partial materialization)

 


Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.