Before doing data mining for BIEE, take a look at these concepts and make sure you understand them.
1. Basic Concepts
Data warehouses and OLAP technology are based on a multidimensional data model. This model views data in the form of a data cube. A multidimensional data model is organized around a central theme, and that theme is represented by a fact table. Facts are numerical measures.
A data cube allows data to be modeled and viewed in multiple dimensions. It is defined by dimensions and facts. Dimensions are the perspectives or entities with respect to which an organization wants to keep records. Each dimension has a table associated with it, called a dimension table, which describes the attributes of that dimension. A fact is a numerical measure of the data being analyzed. A fact table contains the names of the facts (the measures) and the keys to each of the related dimension tables.
An n-dimensional cube that holds the data at its most detailed level is called the base cuboid. Given a set of dimensions, we can build one cuboid for each subset of those dimensions, and together these cuboids form a lattice. Each cuboid shows the data at a different level of aggregation, that is, a different grouping of the data. The whole lattice of cuboids is called a data cube. The 0-dimensional cuboid, which stores the highest-level summary, is called the apex cuboid, and the cuboid that stores the lowest-level summary is the base cuboid.
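The lattice of cuboids can be sketched in a few lines of Python. This is only an illustration: the dimension names (time, item, location) and the helper cuboid_lattice are made up for the example, and concept hierarchies inside each dimension are ignored, so n dimensions give exactly 2^n cuboids.

```python
from itertools import combinations

# Hypothetical dimension names, chosen for illustration only.
dimensions = ["time", "item", "location"]


def cuboid_lattice(dims):
    """Return every cuboid (one per subset of dims), grouped by dimensionality."""
    lattice = {}
    for k in range(len(dims) + 1):
        lattice[k] = [tuple(c) for c in combinations(dims, k)]
    return lattice


lattice = cuboid_lattice(dimensions)
for k, cuboids in lattice.items():
    print(f"{k}-D cuboids: {cuboids}")

# The 0-D cuboid groups by nothing: the highest-level summary (apex cuboid).
apex_cuboid = lattice[0][0]
# The n-D cuboid groups by every dimension: the most detailed level (base cuboid).
base_cuboid = lattice[len(dimensions)][0]
# Without hierarchies, the lattice contains 2 ** n cuboids.
total_cuboids = sum(len(c) for c in lattice.values())
```

Each tuple can be read as the column list of a group-by: the empty tuple is the grand total, and the full tuple is the unaggregated base data.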
2. Multidimensional Data Model Schemas
A. Star schema: the fact table sits in the center and is surrounded by dimension tables (one for each dimension). The fact table holds the bulk of the data, with no redundancy.
B. Snowflake schema: a variant of the star schema. Some dimension tables are normalized (redundant fields are moved into a new table), so the data is split further into additional tables, and the resulting schema graph resembles a snowflake.
C. Fact constellation: multiple fact tables share dimension tables. This schema can be viewed as a collection of star schemas, which is why it is called a galaxy schema or fact constellation.
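As a rough illustration of the star schema described above, here is a minimal sketch using Python's built-in sqlite3 module. The table names (fact_sales, dim_time, dim_item), the columns, and the sample rows are all invented for the example; they do not come from BIEE or any real warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One central fact table plus one table per dimension (hypothetical names).
conn.executescript("""
CREATE TABLE dim_time (time_key INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE dim_item (item_key INTEGER PRIMARY KEY, item_name TEXT, category TEXT);
CREATE TABLE fact_sales (
    time_key INTEGER REFERENCES dim_time(time_key),  -- key into dim_time
    item_key INTEGER REFERENCES dim_item(item_key),  -- key into dim_item
    units_sold INTEGER,                              -- measure
    dollars_sold REAL                                -- measure
);
INSERT INTO dim_time VALUES (1, 2023, 'Q1'), (2, 2023, 'Q2');
INSERT INTO dim_item VALUES (10, 'laptop', 'computer'), (11, 'phone', 'mobile');
INSERT INTO fact_sales VALUES (1, 10, 3, 3000.0), (1, 11, 5, 2500.0), (2, 10, 2, 2000.0);
""")

# A typical star query: join the fact table to its dimensions, then aggregate.
rows = conn.execute("""
SELECT t.quarter, i.category, SUM(f.dollars_sold)
FROM fact_sales AS f
JOIN dim_time AS t ON f.time_key = t.time_key
JOIN dim_item AS i ON f.item_key = i.item_key
GROUP BY t.quarter, i.category
ORDER BY t.quarter, i.category
""").fetchall()
print(rows)
```

In a snowflake version of the same sketch, category would move out of dim_item into its own table referenced by a key. In a fact constellation, a second fact table (for example, a shipping fact table) would reuse dim_time and dim_item.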
3. OLAP operations on Multidimensional Data Models
A. Roll-up: summarize data
Data is aggregated by climbing up the concept hierarchy of a dimension, or by dimension reduction (removing one or more dimensions).
B. Drill-down: the reverse of roll-up
More detailed data is obtained from less detailed data, either by stepping down the concept hierarchy of a dimension or by introducing a new dimension.
C. Slice and dice: projection and selection
D. Pivot (rotate): reorient the cube to get a different view of the data, for example by rotating the coordinate axes or by converting a three-dimensional cube into a series of two-dimensional planes.
E. Drill-across: executes queries that involve more than one fact table.
F. Drill-through: uses the relational SQL mechanism to drill through the bottom level of the data cube to its back-end relational tables.
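The cube operations in items A through D above can be sketched on a tiny in-memory cube. This is a toy illustration in plain Python, not how an OLAP server implements them: the cell values, the city-to-country hierarchy, and the helper aggregate are all hypothetical.

```python
from collections import defaultdict

# Hypothetical base-cuboid cells: (quarter, city, item) -> units sold.
cells = {
    ("Q1", "Toronto", "phone"): 5,
    ("Q1", "Toronto", "laptop"): 3,
    ("Q1", "Vancouver", "phone"): 4,
    ("Q2", "Toronto", "phone"): 6,
    ("Q2", "Vancouver", "laptop"): 2,
}

# Hypothetical concept hierarchy for the location dimension: city -> country.
CITY_TO_COUNTRY = {"Toronto": "Canada", "Vancouver": "Canada"}


def aggregate(cube, key_fn):
    """Sum the measure over all cells that key_fn maps to the same key."""
    out = defaultdict(int)
    for coords, value in cube.items():
        out[key_fn(coords)] += value
    return dict(out)


# Roll-up by climbing the hierarchy: replace each city with its country.
rollup_hierarchy = aggregate(cells, lambda c: (c[0], CITY_TO_COUNTRY[c[1]], c[2]))

# Roll-up by dimension reduction: drop the city dimension entirely.
rollup_reduction = aggregate(cells, lambda c: (c[0], c[2]))

# Drill-down: from the summary cell ("Q1", "phone") back to the detailed
# cells behind it, which reintroduces the city dimension.
drill_down = {c: v for c, v in cells.items() if c[0] == "Q1" and c[2] == "phone"}

# Slice: select a single value on one dimension (quarter == "Q1").
slice_q1 = {c[1:]: v for c, v in cells.items() if c[0] == "Q1"}

# Dice: select on two or more dimensions at once.
dice = {c: v for c, v in cells.items() if c[1] == "Toronto" and c[2] == "phone"}

# Pivot: swap the axes of the 2-D view, (quarter, item) -> (item, quarter).
pivoted = {(item, quarter): v for (quarter, item), v in rollup_reduction.items()}

print(rollup_reduction)
print(pivoted)
```

Note that drill-down reads from the detailed cells rather than from the summary: a rolled-up total cannot be split back into its parts, so the finer-grained data has to be kept somewhere.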