Data granularity in Data Warehouses

Source: Internet
Author: User

Granularity is one of the most important aspects of data warehouse design. It refers to the level of detail or summarization of the data units stored in the data warehouse. The more detailed the data, the smaller (lower) the granularity; the more summarized the data, the larger (higher) the granularity. Determining granularity is a major design issue for data warehouse developers. If the granularity is chosen well, the other aspects of design and implementation can proceed smoothly; if it is chosen poorly, every other aspect becomes difficult. Granularity also matters to the data warehouse architect because it affects every downstream environment that depends on data obtained from the warehouse.

The main problem of granularity is choosing a proper level: it should be neither too high nor too low. A low level of granularity provides detailed data but requires a large amount of storage space and longer query times. A high level of granularity supports fast, convenient queries but cannot provide much detail. When selecting an appropriate level of granularity, consider the types of analysis required, the total storage space available, and other factors specific to the business.
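The trade-off above can be illustrated with a minimal sketch. The call records, field layout, and the `roll_up` helper below are hypothetical, invented only for this example; the point is that rolling detail up to a higher level of granularity reduces the number of stored rows but also removes information.

```python
from collections import defaultdict
from datetime import date, datetime

# Hypothetical detail records: (customer_id, call_start, minutes)
CALLS = [
    ("c1", datetime(2024, 1, 3, 9, 15), 4),
    ("c1", datetime(2024, 1, 3, 18, 40), 11),
    ("c1", datetime(2024, 1, 20, 8, 5), 2),
    ("c2", datetime(2024, 1, 3, 12, 0), 7),
    ("c2", datetime(2024, 2, 1, 12, 0), 5),
]

def roll_up(calls, key_fn):
    """Summarize calls into one row per key: (call_count, total_minutes)."""
    summary = defaultdict(lambda: [0, 0])
    for customer, start, minutes in calls:
        row = summary[key_fn(customer, start)]
        row[0] += 1        # number of calls in this group
        row[1] += minutes  # total minutes in this group
    return {key: tuple(row) for key, row in summary.items()}

# Lowest granularity: one row per call (CALLS itself).
# Higher granularity: one row per customer per day.
daily = roll_up(CALLS, lambda c, t: (c, t.date()))
# Highest granularity here: one row per customer per month.
monthly = roll_up(CALLS, lambda c, t: (c, t.year, t.month))

print(len(CALLS), len(daily), len(monthly))  # row counts shrink: 5 4 3
print(daily[("c1", date(2024, 1, 3))])       # (2, 15)
print(monthly[("c1", 2024, 1)])              # (3, 17)
```

The monthly summary can still answer "how many minutes did c1 use in January?", but it can no longer answer "did c1 make a call on the evening of January 3?". Only the detailed records can.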

The granularity model in a data warehouse refers to the level of detail or summarization of the data units in the warehouse, and to the time-period parameter by which data is recorded or summarized (data warehouse and data mining). It determines the time detail and the level of the data units stored in the data warehouse. Granularity can be divided into two types. The first type is a measure of the degree of data summarization in the data warehouse. It affects both the amount of data in the warehouse and the types of queries the warehouse can answer. The smaller the granularity, the higher the level of detail, the lower the level of summarization, and the more types of questions can be answered. Conversely, the larger the granularity, the lower the level of detail, the higher the level of summarization, and the fewer types of questions can be answered.

The second type is sample database granularity, which differs from the first type. Sample database granularity is not divided by level of summarization but by sampling rate, so sample databases with different sampling granularities can have the same level of summarization. A sample database is a database built by extracting data from the detailed archive or from lightly summarized data at a certain sampling rate. Because it holds only a sample taken from the data source according to certain requirements, it cannot answer some detailed questions. Sample database extraction can also be performed based on the importance of the data.
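As a rough illustration of sample database extraction, the sketch below keeps each record with a probability equal to the sampling rate and always keeps records judged important. The `extract_sample` function, the record layout, and the importance rule are assumptions made for this example, not a method described in the article.

```python
import random

def extract_sample(records, rate, seed=0, is_important=None):
    """Return a sample of `records`.

    Each record is kept with probability `rate`. Records for which the
    optional `is_important` predicate returns True are always kept.
    """
    if not 0 < rate <= 1:
        raise ValueError("rate must be in the interval (0, 1]")
    rng = random.Random(seed)  # fixed seed so the extraction is repeatable
    sample = []
    for rec in records:
        important = is_important is not None and is_important(rec)
        if important or rng.random() < rate:
            sample.append(rec)
    return sample

# Hypothetical detail data: 1000 accounts with a balance each.
records = [{"id": i, "balance": i * 10} for i in range(1000)]

# Sample at a 10% rate, but always keep the largest accounts.
sample = extract_sample(
    records, rate=0.1, is_important=lambda r: r["balance"] >= 9900
)

print(len(records), len(sample))
```

A sample like this can support trend or average-style analysis at much lower cost, but a question about one specific account may not be answerable, because that account may not have been extracted.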
