Data Warehouse Topics and Subject Domains

Source: Internet
Author: User

What is a data warehouse topic?

Ever since I began studying data warehouses, the "topic-oriented" (subject-oriented) nature of a data warehouse has seemed vague to me, and my understanding of it has never been thorough. Last night I reviewed the textbook and still could not grasp the essential idea, so I looked up some material online and worked through it carefully.

1. The concept of a topic

A topic (subject) is an abstract concept under which the data in enterprise information systems is synthesized, classified, and analyzed at a higher level. Each topic corresponds to a macro-level analysis area. Logically, a topic is the analysis object involved in one such area of the enterprise. For example, "Sales Analysis" is an analysis area, so the topic of a data warehouse application built for it is "Sales Analysis".

Topic-oriented data organization gives a complete and consistent description of the analysis objects at a higher level. It describes all the enterprise data that each analysis object involves, as well as the relationships among that data. "Higher level" is relative to application-oriented data organization: data organized by topic sits at a higher level of abstraction. Data in a data warehouse is organized by topic, in contrast to a traditional database, which is organized by application. For example, the topics in a manufacturing enterprise's data warehouse might include product ordering analysis and shipment analysis, whereas an organization by application might consist of a financial subsystem, a sales subsystem, a supply subsystem, a human resources subsystem, and a production scheduling subsystem.
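The contrast between the two ways of organizing data can be sketched in Python. The subsystem and topic names follow the example above; the record fields, the sample rows, and the `TOPIC_SOURCES` mapping are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Application-oriented organization: each operational subsystem keeps its own records.
# All records below are invented sample data.
operational_data = {
    "sales": [
        {"order_id": 1, "product": "bike", "quantity": 2},
        {"order_id": 2, "product": "helmet", "quantity": 5},
    ],
    "supply": [
        {"shipment_id": 10, "order_id": 1, "shipped": True},
        {"shipment_id": 11, "order_id": 2, "shipped": False},
    ],
    "finance": [
        {"invoice_id": 100, "order_id": 1, "amount": 900.0},
    ],
}

# Topic-oriented organization: each analysis topic draws on several subsystems.
# This mapping is an assumption, not something the text prescribes.
TOPIC_SOURCES = {
    "product ordering analysis": ["sales", "finance"],
    "shipment analysis": ["sales", "supply"],
}


def organize_by_topic(data, topic_sources):
    """Regroup subsystem records under the topics that need them."""
    by_topic = defaultdict(list)
    for topic, subsystems in topic_sources.items():
        for subsystem in subsystems:
            for record in data.get(subsystem, []):
                # Tag each record with its origin so its source stays traceable.
                by_topic[topic].append({"source": subsystem, **record})
    return dict(by_topic)


topics = organize_by_topic(operational_data, TOPIC_SOURCES)
for topic, records in topics.items():
    print(topic, "->", len(records), "records")
```

The same sales records appear under both topics: a topic cuts across subsystems, so one operational source can feed several analysis topics.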

Topics are determined by the requirements of analysis, not by the requirements of data processing or of applications. Take material supply in the same manufacturing enterprise. In the operational database system, the concern is how to process material supply transactions more conveniently and quickly. In analysis, the concern is instead the different procurement channels, the timeliness of material supply, and the quality of the materials.

The data warehouse targets the enterprise's main subject areas as defined in its data model. Typical subject areas include customers, products, orders, and finances, or some other business area or activity.
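As a rough illustration of the analytical view of material supply, the sketch below summarizes delivery records by procurement channel. The field names (`channel`, `on_time`, `passed_quality`) and the sample rows are assumptions made up for this example.

```python
from collections import defaultdict

# Invented material-supply records; the field names are illustrative assumptions.
deliveries = [
    {"channel": "direct", "on_time": True, "passed_quality": True},
    {"channel": "direct", "on_time": False, "passed_quality": True},
    {"channel": "distributor", "on_time": True, "passed_quality": False},
    {"channel": "distributor", "on_time": True, "passed_quality": True},
]


def summarize_by_channel(records):
    """Return the on-time rate and quality pass rate for each procurement channel."""
    counts = defaultdict(lambda: {"total": 0, "on_time": 0, "passed": 0})
    for record in records:
        entry = counts[record["channel"]]
        entry["total"] += 1
        entry["on_time"] += record["on_time"]        # True counts as 1
        entry["passed"] += record["passed_quality"]  # True counts as 1
    return {
        channel: {
            "on_time_rate": entry["on_time"] / entry["total"],
            "quality_pass_rate": entry["passed"] / entry["total"],
        }
        for channel, entry in counts.items()
    }


summary = summarize_by_channel(deliveries)
for channel, rates in summary.items():
    print(channel, rates)
```

An operational system would care about inserting and updating each delivery quickly; the analysis side only reads the records and aggregates them along the dimensions the topic calls for.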

2. Determining the subject domain

A subject domain is the boundary of a topic, determined by analyzing that topic. The first step in the information packaging method is to analyze the subject domains and decide which topics to load into the data warehouse. A data warehouse is generally built one topic, or one part of the enterprise's full set of topics, at a time, so most data warehouse design processes include a subject domain selection step. The subject domain must be determined jointly by the end users and the data warehouse designers.

For example, the topics that the management of a company such as Adventure Works Cycles needs to analyze typically include a vendor topic, a product topic, a customer topic, and a warehouse topic. The product topic covers the purchasing, sale, and storage of goods; the customer topic covers the goods that customers purchase; and the warehouse topic covers the storage of goods in warehouses and the management of those warehouses, as shown in Figure 3-31.

Figure 3-31 Analysis topics determined by business conditions

Determining a topic's boundary requires a closer understanding of the business relationships involved. Once the full set of analysis topics has been chosen, each topic should therefore be given a preliminary description, which makes it easier to see where its boundary lies. For the four topics in Figure 3-31 and their business relationships within the enterprise, the boundaries can be drawn as shown in Figure 3-32.

Figure 3-32 Dividing the subject domains

3. Determining the content of a topic

Although the topic occupies only the title position in the information package diagram, it is the most important part of the information packaging method: once the topics are defined, the logical model of the data warehouse has essentially taken shape. At this point, all of the attributes and system-related behaviors need to be included in the topic's logical relational schema. The data storage structure of the data warehouse also needs to be defined during logical model design, adding whatever information and attribute groups are required to represent the topic adequately. For a company data warehouse such as that of Adventure Works Cycles, attribute groups that further describe the commodity, sales, and customer topics can be added, as shown in Table 3-7.

Table 3-7 Detailed description of the topics

| Subject name | Common code key | Attribute groups |
|---|---|---|
| Commodity | Product number | Product intrinsic information: product number, product name, type, color, etc. |
| | | Commodity procurement information: commodity number, supplier number, supply price, supply date, supply, etc. |
| | | Commodity stock information: product number, warehouse number, inventory, date, etc. |
| Sales | Sales tracking number | Sales order intrinsic information: sales number, sales address, etc. |
| | | Sales information: customer number, product number, sales price, sales, sale time, etc. |
| Customer | Customer number | Customer information: customer number, customer name, gender, age, educational level, address, telephone, etc. |
| | | Customer economic interest: customer number, annual income, household income, etc. |
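One way to hold the content of Table 3-7 in code is sketched below. The `Topic` class, the group labels, and the helper `shared_attributes` are hypothetical names introduced for illustration. Only a subset of the attributes listed in the table is included, and the commodity number is assumed to be the same attribute as the product number.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Topic:
    """A topic with its common code key and named attribute groups."""
    name: str
    key: str
    attribute_groups: Dict[str, List[str]] = field(default_factory=dict)

    def attributes(self) -> Set[str]:
        """All attributes of the topic, across every attribute group."""
        return {a for group in self.attribute_groups.values() for a in group}


commodity = Topic("Commodity", "product number", {
    "intrinsic": ["product number", "product name", "type", "color"],
    "procurement": ["product number", "supplier number", "supply price", "supply date"],
    "stock": ["product number", "warehouse number", "inventory", "date"],
})
sales = Topic("Sales", "sales number", {
    "order": ["sales number", "sales address"],
    "sales": ["customer number", "product number", "sales price", "sale time"],
})
customer = Topic("Customer", "customer number", {
    "basic": ["customer number", "customer name", "gender", "age",
              "educational level", "address", "telephone"],
    "economic": ["customer number", "annual income", "household income"],
})


def shared_attributes(a: Topic, b: Topic) -> Set[str]:
    """Attributes that appear in both topics and can relate one to the other."""
    return a.attributes() & b.attributes()


print(sorted(shared_attributes(commodity, sales)))
print(sorted(shared_attributes(sales, customer)))
```

Under these assumptions the sales topic shares the product number with the commodity topic and the customer number with the customer topic, which is how the sales attribute groups tie the three topics together.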

4. Using topics

Because data warehouse design is a spiral, iterative process, not every topic needs to be implemented in the data warehouse database at the outset. Instead, the most important topic is chosen as the touchstone for the design. Using topics therefore starts with identifying the subject domain that needs to be analyzed first.

For example, in the conceptual model design of the AdventureWorksDW data warehouse, requirements analysis showed that the "commodity" topic is not only the basic business object of a sales enterprise but also its most important area of decision analysis, so it was chosen as the first topic to be established. Building the "commodity" topic gives the business operators a more complete view of the situation of the whole enterprise, and implementing it first satisfies the management's initial requirements for the data warehouse as early as possible. The "commodity" topic is therefore selected for implementation first.

An initial conceptual model can also be formed by applying the topic boundaries to the relational model already obtained. This model combines the division into subject domains with the tables in the transactional database. In the example above, the relational tables covered by the commodity topic may include a commodity table, a supply relationship table, a purchase relationship table, and a warehousing relationship table; those covered by the warehouse topic may include a warehouse relationship table, a warehouse table, a warehouse management relationship table, and an administrator table. Linking the keys and fields of these tables yields the initial conceptual model shown in Figure 3-33.

Figure 3-33 Initial conceptual model after dividing the subject domains
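The step of laying topic boundaries over the relational tables and linking them by key can be sketched as follows. The table names follow the text, but the key columns in `TABLE_KEYS` are hypothetical, since the text does not list them. Treating the warehouse topic's warehouse relationship table as the same table as the commodity topic's warehousing relationship table is also an assumption.

```python
from itertools import combinations

# Tables named in the text, grouped by the topic that covers them.
TOPIC_TABLES = {
    "commodity": {
        "commodity", "supply_relationship",
        "purchase_relationship", "warehousing_relationship",
    },
    "warehouse": {
        "warehousing_relationship", "warehouse",
        "warehouse_management_relationship", "administrator",
    },
}

# Hypothetical key columns for each table; the text does not specify them.
TABLE_KEYS = {
    "commodity": {"product_no"},
    "supply_relationship": {"product_no", "supplier_no"},
    "purchase_relationship": {"product_no", "customer_no"},
    "warehousing_relationship": {"product_no", "warehouse_no"},
    "warehouse": {"warehouse_no"},
    "warehouse_management_relationship": {"warehouse_no", "admin_no"},
    "administrator": {"admin_no"},
}


def link_tables(table_keys):
    """Return (table_a, table_b, shared_keys) for each pair of tables sharing a key."""
    links = []
    for a, b in combinations(sorted(table_keys), 2):
        shared = table_keys[a] & table_keys[b]
        if shared:
            links.append((a, b, sorted(shared)))
    return links


def bridging_tables(topic_tables, topic_a, topic_b):
    """Tables that fall inside the boundaries of both topics."""
    return topic_tables[topic_a] & topic_tables[topic_b]


links = link_tables(TABLE_KEYS)
bridge = bridging_tables(TOPIC_TABLES, "commodity", "warehouse")
print(len(links), "key-based links between tables")
print("tables shared by both topics:", sorted(bridge))
```

With these assumed keys, the warehousing relationship table sits on the boundary between the two topics, so it is the natural place where the commodity and warehouse subject domains connect in the conceptual model.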
