SSAs series-[03] multi-dimensional data (multi-dimensional dataset object)

Source: Internet
Author: User

1. What is cube?

A simple cube object consists of basic information, dimensions, and measurement value groups. The basic information includes the name, default measurement value, data source, and storage mode of the cube. A dimension is an actual dimension group used in a multi-dimensional dataset. All dimensions must be defined in the dimension set of the database before being referenced in the multidimensional data set. A measurement value group is a measurement value set in a multi-dimensional dataset. A measurement value group is a set of metric values with common data source views and dimension sets. A measurement value group is the processing unit of a measurement value. You can process a measurement value group separately before browsing. This concept msnd is very clear and easy to understand. Just copy it.

2. measurement value and measurement value group

Measurements are usually mapped to columns in fact data tables. Attribute columns in a dimension table can be used to define measurement values. However, these measurement values are semi-cumulative or non-cumulative in aggregate behavior.

Graph fact data table metric Value

A simple measuregroup object consists of basic information, metric values, dimensions, and partitions. The basic information includesName, Metric ValueType,Storage ModeAndProcessing mode. A metric value is a set of actual metric values that constitute a measurement value group. For each measurement value, there is an aggregate function definition, a format attribute definition, a data item source definition, and other definitions. A dimension is a subset of dimensions used to create a processed metric value group. A partition is a set of physical shards of a processed metric group. In a multi-dimensional dataset, metric values are grouped into multiple measurement groups based on their basic fact data tables. A measurement value group is used to associate dimensions with measurement values. A measurement value group can also be used to take non-repeated counts as the measurement value for its aggregation behavior. You can optimize aggregation by placing each non-repeated count metric value in your measurement value group.

3. granularity and Aggregate functions

In OO, if the implementation of a target "Fleet" only goes deep into the "Automobile" level, rather than the "engine" and "tire" level, the former is more granular than the latter.
As long as the requirements can be met, the larger the granularity, the better, the simpler and more practical. The concept of granularity in a data warehouse is the same as that in a data warehouse.

Aggregate functions are used for data statistics of database tables, such as sum, count, and AVG. Needless to say, the aggregation behavior of each Metric value is determined by the aggregate function associated with this metric value.

4. Dimension relationship

It refers to the attributes of a dimension and fact data table at a certain dimension granularity.

Regular dimension relationship

There is nothing to say about the star model ,:

Application Model relationship

Snowflake model ,:

Multi-to-Multi-Dimensional Relationship

Fact dimension relationship

A fact dimension (usually called a degraded dimension) is a standard dimension constructed from attribute columns in a fact data table rather than a dimension table. Useful Dimension Data is sometimes stored in fact data tables to reduce duplication.

5. Computing

Finally, I am very interested in this topic. It should be one of the core operations of the data warehouse. Computing is a multidimensional expression (MDX)Expressions or scriptsDefines the computing Member, nameset, or range allocation in a cube in SSAs. In my understanding, after calculation, a new measurement value can be generated based on some original measurement values. The new measurement value will be used with the original one when you browse the cube later. Using script commands can make computing complex and flexible to meet business needs.

6. Key Performance Indicators

In business terms, key performance indicators (KPIs) are measurable measurements used to determine business performance. A simple KPI consists of basic information, goals, actual values, Status values, trend values, and folders of KPIs viewed in it. For example, a sales department in a unit can use the gross profit of each month as the KPI, but the HR department in the same unit can use the employee lost on a quarterly basis as the KPI. This is a very important indicator. Currently, we have understood this degree.

7. Action

The main purpose of the operation is to provide the stored MDX statements to client applications that can be used by client applications. It is the execution command of the client defined on the server side. My understanding is similar to the stored procedure in the database engine.

8. Partition

Partition is the container of some metric group data. A simple partition object consists of basic information (name, storage mode, and processing mode), slice definition, and aggregation design. SSAs uses partitions to manage and store data and aggregation of metric value groups. Partitions are invisible to business users of multidimensional datasets. Allows the source and aggregate data of a cube to be distributed across multiple hard drives and multiple server computers. Partition and greatly improve query performance, load performance, and the maintainability of multidimensional datasets. The partition structure must match the structure of the measurement value group.

Partitioned Storage

Molap, ROLAP, holap

Active cache (partition)

Multiple Active cache configuration options are provided, allowing you to maximize performance, minimize latency, and arrange processing.

Remote partitioning

Data in remote partitions is stored on another Microsoft SQL Server Analysis Services instance, but not on instances that contain partition definitions (metadata) and their parent cubes.

Writable partitions

It is said that the query performance can be improved after partitioning is used. How can this performance be improved?

9. pivoting

Pivot is a read-only view of a multi-dimensional dataset. Pivot controls the visibility of objects contained in a multi-dimensional dataset. The following objects can be displayed or hidden in a perspective:

Dimensions, attributes, hierarchies, metric groups, metric values, key performance indicators (KPIs), computing (computing members, namespaces, and script commands), and operations.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.