In the past, the dimension tables, fact tables, data analysis, BI and other concepts have some ambiguity. These days of study finally let these have some clues:
A dimension represents the amount of data you want to analyze , such as when you analyze a product's sales, you can choose to analyze it by category , or by region . Such a press. The analysis forms a dimension. The previous example can have two dimensions: type and region. In addition, each dimension can have sub-dimensions (called attributes), such as attributes that can have subtypes , product names , and so on. Here are two common dimension table structures:
Product dimension table: prod_id, Product_Name, Category, Color, Size, Price Time dimension table: TimeKey, Season, year, Month, Date
The fact table is a result table that is generated from a dimension after data aggregation. Examples of its structure are as follows:
Sales Fact Sheet: prod_id (Reference Product dimension table), TimeKey (Reference Time dimension table), SalesAmount (total sales, in currency), Unit (volume)
The tables above are present in the Data Warehouse. From here you can see that it has several features:
1. The redundancy of the dimension tables is large, mainly because the dimensions are generally small (relative to the fact table), and the redundancy of the dimension tables can save a lot of space in the fact table. 2. Fact sheets are generally very large, and if queried in an ordinary way, the time to get the results generally is not acceptable to us. So it usually has to do some special processing. As in SQL Server 2005, the fact table is processed such as pre-build processing. 3. The primary key of the dimension table generally takes the type of the flag column of the integer value, which is also to save the storage space of the fact table.
Dimension table, fact table, Data Warehouse, BI ...