How to Create a Successful Data Warehouse

Source: Internet
Author: User
How do you create a successful data warehouse? The following story will tell you.



The company's data warehouse project began with a casual conversation between several executives on their way to lunch. The people involved were the IT manager for decision support as well as several members of a department that had just decided to install a data warehouse. They had planned to install their data warehouse without any involvement from the IT department; nonetheless, the following conversation ensued:
"We urgently need a data warehouse to analyze our data!" "In which case, why don't you take an OLAP tool with a multidimensional database?"
"Is it possible to make the sales figures available to our salespeople?" "Yes, of course, that's no problem because of its Web capability."
"We need our answers very fast." "Performance isn't an issue; the data requested can be made available on a local server."
"Great, when can we start our analyses?" "Installing such a system shouldn't take more than a few weeks."
Encouraged by this casual tip from an expert, the department decided to build a data warehouse that corresponded to its specific needs. Some months later, the data warehouse was installed according to the original specifications. After the successes had become public knowledge within the company, the other departments began to show interest in the data warehouse. The system was proudly displayed, and enthusiasm spread. Suddenly, each department wanted its own data warehouse, and requests began to pile up on the IT desks. However, apart from the casual conversation described above, the IT department had not been involved in this data warehouse. The project itself had been implemented by the department with the help of an external system integrator. Nobody had planned on integrating additional user groups. It had become imperative to further develop this successful data warehouse. At this point it became clear that the department had locked itself into a data mart with limited scalability, instead of building a data warehouse with unlimited capacity for expansion. This difference between data marts and data warehouses was a basic issue that the whole company now had to face.
What to Do Next?
Based on the situation described, some questions arise, such as: Can a data warehouse originally conceived for only one department be used for the whole company, or should new data marts be built for each department? If the latter solution is preferred, how would one department access data from another department? Who would guarantee that all users receive exactly the same information? The question of whether to start with data marts or a data warehouse has been widely discussed [1, 2, 3]. It can only be answered by clearly defining the project's goals: does it have to cover the information needs of certain departments, or is it meant to become a shared enterprise information pool? If only departmental needs are the issue, it would suffice to install some isolated data marts. However, if a company regards access to an integrated, company-wide database as critical to its future survival in the market, then an enterprise data warehouse is the solution to implement.

My thoughts so far may have created the impression that there are only two options: either quickly install a few data marts to cover a few departments' current needs for information, or embark on the expensive adventure of installing an enterprise data warehouse. There is a third option that combines the best of both worlds and can be implemented quickly without sacrificing future growth options. This third option is to lay the foundation of an enterprise data warehouse by starting with a scalable data warehouse framework in a pilot project.
How to Proceed
The procedure that leads to this scalable data warehouse pilot project is specifically designed to satisfy two contradictory requirements: fast delivery and expandability. Assuming the project is well prepared, it should not take more than three or four months to implement a fully functional pilot for an enterprise data warehouse. After it is finished, a company-wide data warehouse platform will be available, allowing users to execute concrete analyses and develop a better understanding of their real and shared needs.

What is the difference between a data warehouse pilot and a departmental data mart? In fact, the two approaches differ more in their strategic goals than in the total expenditure required for conceptualization and implementation. Running a project within a department means you don't have to negotiate with other departments and IT managers, something that can prove time-consuming. By contrast, if you want to establish a company-wide project, you must coordinate this effort with the other departments, IT management and top executives.

Figure 1: The process of creating a pilot for an enterprise data warehouse.

Figure 1 shows a preparatory phase that starts the data warehouse pilot project. Thorough preparation will ensure that the pilot project does not exceed a three- to four-month time frame. Among other things, the project team will have to clarify technical issues regarding the system selected, issues mostly arising out of the chosen data warehouse. The selection criteria for the computer depend upon the amount of data anticipated (now and possibly in the known future), the number and types of users, and the complexity of the queries. From the user's point of view, the selection of the analysis tool is the most critical issue; however, with standardized interfaces like ODBC, it isn't mandatory to stay with a chosen tool. In the beginning, it's sufficient to have a suitable OLAP tool for multidimensional analysis and software for accessing the data warehouse database directly.
Project Design
During the design phase, the information necessary for implementing the data warehouse must be gathered, such as:
Requirements of the departments regarding the potential information uses;
Description of the source data used;
Definition of business terms, data definitions and transformation rules;
Data models for the data warehouse and the local data marts.
Simultaneously, the necessary hardware and software must be installed. Basically, the design phase can be broken down into four steps:

Business questions from departments. In order to increase the pilot project's chance of success, the selected business questions need to be of the greatest potential usefulness. Business questions do not necessarily have to be stated as questions. Existing reports that contain key figures, or concrete suggestions for analyses that were not possible before, can also be used.

Data sources available. After the users' requests have been roughly analyzed, the IT department must investigate the source systems and interfaces available within the company. Due to the constraints the pilot project faces, only data that is available and meets certain quality standards, such as completeness and correct contents, can be considered. For a successful pilot installation, it's important to focus on what is most important. Therefore, the business questions must be correlated to the available data.

Business data model. The business data model reflects the real objects (customer, order, product, etc.) and their relationships. In order to represent them correctly in the business data model, business rules have to be applied, such as "each order relates to one customer" or "the customer can belong to various categories" [4].

Logical data model. The logical data model in its normalized form is based on the business data model and describes all objects and their attributes. Usually, not all attributes available in the source systems are needed for answering the business questions. However, potentially useful data elements should be integrated as well, so that it will not be necessary later to repeat all the analyses performed for the pilot project.
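
As a minimal sketch of how business rules can end up in a normalized logical model, the following Python snippet uses the standard sqlite3 module. The table and column names are hypothetical, and the first rule is assumed to refer to orders; neither comes from the original project. A mandatory foreign key expresses "one customer per order," and a link table expresses "a customer can belong to various categories."

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Hypothetical normalized tables derived from the business data model
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE category (
    category_id INTEGER PRIMARY KEY,
    label       TEXT NOT NULL
);
-- Rule: a customer can belong to various categories (many-to-many link table)
CREATE TABLE customer_category (
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    category_id INTEGER NOT NULL REFERENCES category(category_id),
    PRIMARY KEY (customer_id, category_id)
);
-- Rule (assumed): each order relates to exactly one customer
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    order_date  TEXT NOT NULL
);
""")

conn.execute("INSERT INTO customer VALUES (1, 'Hotel Adler')")
conn.executemany("INSERT INTO category VALUES (?, ?)",
                 [(1, "hotel"), (2, "key account")])
conn.executemany("INSERT INTO customer_category VALUES (?, ?)", [(1, 1), (1, 2)])
conn.execute("INSERT INTO customer_order VALUES (100, 1, '2024-03-05')")


def order_without_customer_rejected(connection):
    """Return True if an order pointing at a missing customer is refused."""
    try:
        connection.execute(
            "INSERT INTO customer_order VALUES (101, 999, '2024-03-06')")
    except sqlite3.IntegrityError:
        return True
    return False


rejected = order_without_customer_rejected(conn)
categories = [row[0] for row in conn.execute(
    "SELECT label FROM category JOIN customer_category USING (category_id) "
    "WHERE customer_id = 1 ORDER BY label")]
```

Keeping the rules in the schema itself means that the checks performed during the pilot do not have to be repeated by every analysis tool that reads the data later.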


Figure 2: All the data is available through the access layer.

For reasons of performance, or because a query tool requires it, denormalized data models are needed alongside the normalized data model [5]. One possibility is to complement the normalized data model with summary tables. In another approach, so-called "star schemas" or "star models" are created in addition to the normalized data model (Figure 2). Together with the normalized data, they are available to users as views on the database. The data warehouse is always accessed through the security layer, in which all the access authorizations are defined. The normalized approach provides much greater capability and scalability, allowing any question to be asked of the data and new data to be added easily to the data warehouse database in the future.

Figure 2 represents the data marts as logical structures, i.e., the numbers are recalculated each time they are requested. When starting the pilot project, it is best to create the data marts logically. Only if performance is really lacking would they be physically implemented by using fact tables, since optimizing performance is not the pilot project's top priority. If the performance is acceptable, it's sufficient to begin analyzing the data selected. Further fine-tuning of both the database and the tools utilized should be carried out after the users and managers have gained some experience with the system and seen its value.
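
The difference between a logical and a physical data mart can be illustrated with a small Python and sqlite3 sketch. All table, view and column names here are assumptions made for the example. The logical mart is a view over the normalized tables, so its figures are recalculated on every query; the physical variant copies the same figures into a summary table, which is faster to read but goes stale until it is rebuilt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized base tables (hypothetical names and columns)
CREATE TABLE product (
    product_id    INTEGER PRIMARY KEY,
    product_group TEXT NOT NULL
);
CREATE TABLE sale (
    sale_id    INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES product(product_id),
    sale_month TEXT NOT NULL,      -- e.g. '2024-03'
    amount     REAL NOT NULL
);

-- Logical data mart: a view, so the figures are recalculated on every query
CREATE VIEW mart_sales_by_group AS
SELECT p.product_group, s.sale_month,
       SUM(s.amount) AS total_amount, COUNT(*) AS n_sales
FROM sale AS s JOIN product AS p ON p.product_id = s.product_id
GROUP BY p.product_group, s.sale_month;
""")

conn.executemany("INSERT INTO product VALUES (?, ?)",
                 [(1, "beverages"), (2, "food")])
conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?)", [
    (1, 1, "2024-03", 10.0),
    (2, 1, "2024-03", 15.0),
    (3, 2, "2024-03", 40.0),
])


def mart_rows(connection, relation):
    """Read a mart (view or table) in a stable order."""
    # relation is a fixed identifier chosen in this sketch, never user input
    return connection.execute(
        f"SELECT product_group, sale_month, total_amount, n_sales "
        f"FROM {relation} ORDER BY product_group, sale_month").fetchall()


logical_before = mart_rows(conn, "mart_sales_by_group")

# Physical variant: materialize the same figures into a summary table
conn.execute(
    "CREATE TABLE mart_sales_by_group_phys AS SELECT * FROM mart_sales_by_group")

# A new sale arrives: the view reflects it at once, the summary table does not
conn.execute("INSERT INTO sale VALUES (4, 2, '2024-03', 5.0)")
logical_after = mart_rows(conn, "mart_sales_by_group")
physical_after = mart_rows(conn, "mart_sales_by_group_phys")
```

Starting with the view keeps the pilot simple; the summary table only becomes worthwhile once query times are actually a problem, and it then needs a refresh step as part of the regular update.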

A note with regard to these two different data models: the normalized data model represents all business rules and, therefore, should not be changed without very good reasons for doing so. By contrast, the data marts are for the most part based on star schema models and contain data for specific subject areas. If an organization uses star models and its business requirements change, the data marts have to be rebuilt and/or readjusted to meet the new requirements. This can be very expensive and can also limit the future scalability and growth of your data warehouse.
Checking the Results
The last step before adopting the logical data model is to check it by using selected business queries. A typical business query may be: "Give me all sales in a specific month, broken down by industry (i.e., hotels and restaurants only), number of transactions, types of customers and mode of payment."

Using this query, the system integrator and the users check the model table by table to find possible interpretative errors made by the data modeler. Experience shows that end users are very well able to understand a logical data model even if they have never seen one before. Particularly for direct queries to the database, a profound understanding of the data warehouse data model is a necessity.
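
To make such a check concrete, a business query like the one quoted above can be written against a candidate model and run on a handful of test rows. The sketch below uses Python's sqlite3 module; the two tables, their columns and the sample values are invented for illustration and are not the model from the project described here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical slice of the logical data model
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    industry      TEXT NOT NULL,
    customer_type TEXT NOT NULL
);
CREATE TABLE sale (
    sale_id      INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL REFERENCES customer(customer_id),
    sale_date    TEXT NOT NULL,     -- ISO date 'YYYY-MM-DD'
    payment_mode TEXT NOT NULL,
    amount       REAL NOT NULL
);
""")
conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", [
    (1, "hotel", "corporate"),
    (2, "restaurant", "private"),
    (3, "retail", "corporate"),
])
conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?, ?)", [
    (1, 1, "2024-03-02", "card", 200.0),
    (2, 1, "2024-03-15", "card", 100.0),
    (3, 2, "2024-03-20", "cash", 50.0),
    (4, 3, "2024-03-21", "card", 75.0),   # retail: filtered out by industry
    (5, 2, "2024-04-01", "cash", 60.0),   # April: filtered out by month
])

# Sales for one month, hotels and restaurants only, broken down by
# industry, customer type and mode of payment, with transaction counts
BUSINESS_QUERY = """
SELECT c.industry, c.customer_type, s.payment_mode,
       COUNT(*) AS n_transactions, SUM(s.amount) AS total_sales
FROM sale AS s JOIN customer AS c ON c.customer_id = s.customer_id
WHERE strftime('%Y-%m', s.sale_date) = ?
  AND c.industry IN ('hotel', 'restaurant')
GROUP BY c.industry, c.customer_type, s.payment_mode
ORDER BY c.industry, c.customer_type, s.payment_mode
"""


def sales_for_month(connection, month):
    """Run the business query for one month given as 'YYYY-MM'."""
    return connection.execute(BUSINESS_QUERY, (month,)).fetchall()


result = sales_for_month(conn, "2024-03")
```

If a query like this cannot be expressed at all, or needs a column that no table provides, that is exactly the kind of modeling gap the table-by-table review is meant to uncover.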
Implementation of the Design
After the design phase has been completed with the check outlined above, the design is implemented on the target system, together with the creation of analyses and reports. The implementation phase consists of all steps necessary to transfer data from the operational systems into the data warehouse.

Core steps for success in using this method are:
The transformation of the logical data model into a physical data structure on the target system;
The creation of extraction and transformation programs;
The implementation of the required control procedures to periodically update the data in the data warehouse;
Users defining and testing their new analyses and reports.
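
The first three steps above can be sketched in a few lines of Python with sqlite3. This is only an illustration under stated assumptions: the schema, the "amounts arrive as text in cents" transformation rule and the load_control table are all hypothetical, and a real source system would be a database or file interface instead of a list of dictionaries. The fourth step, users defining and testing their analyses, is not covered.

```python
import sqlite3


def create_physical_structure(connection):
    """Step 1: turn the logical model into physical tables (hypothetical schema)."""
    connection.executescript("""
    CREATE TABLE IF NOT EXISTS sale (
        sale_id   INTEGER PRIMARY KEY,
        sale_date TEXT NOT NULL,
        amount    REAL NOT NULL
    );
    CREATE TABLE IF NOT EXISTS load_control (
        run_id        INTEGER PRIMARY KEY AUTOINCREMENT,
        run_date      TEXT NOT NULL,
        rows_loaded   INTEGER NOT NULL,
        rows_rejected INTEGER NOT NULL
    );
    """)


def extract(source_rows):
    """Step 2a: extraction. Here the 'source system' is just a list of dicts."""
    return list(source_rows)


def transform(rows):
    """Step 2b: apply transformation rules and reject incomplete records."""
    good, rejected = [], 0
    for row in rows:
        if row.get("id") is None or row.get("amount") in (None, ""):
            rejected += 1
            continue
        # Assumed rule: source amounts arrive as text in cents
        good.append((int(row["id"]), row["date"], int(row["amount"]) / 100))
    return good, rejected


def load(connection, rows, rejected, run_date):
    """Step 3: periodic update plus a control record for each run."""
    with connection:  # commit on success, roll back on error
        connection.executemany(
            "INSERT OR REPLACE INTO sale (sale_id, sale_date, amount) "
            "VALUES (?, ?, ?)", rows)
        connection.execute(
            "INSERT INTO load_control (run_date, rows_loaded, rows_rejected) "
            "VALUES (?, ?, ?)", (run_date, len(rows), rejected))


def run_update(connection, source_rows, run_date):
    """One complete update cycle: extract, transform, load."""
    rows, rejected = transform(extract(source_rows))
    load(connection, rows, rejected, run_date)


conn = sqlite3.connect(":memory:")
create_physical_structure(conn)

source = [
    {"id": 1, "date": "2024-03-02", "amount": "20000"},
    {"id": 2, "date": "2024-03-15", "amount": ""},       # incomplete: rejected
    {"id": 3, "date": "2024-03-20", "amount": "5050"},
]
run_update(conn, source, "2024-03-31")
run_update(conn, source, "2024-04-30")   # re-running does not duplicate rows

sales = conn.execute(
    "SELECT sale_id, amount FROM sale ORDER BY sale_id").fetchall()
runs = conn.execute(
    "SELECT rows_loaded, rows_rejected FROM load_control ORDER BY run_id").fetchall()
```

Keying the load on the source identifier makes the update repeatable, and the control table gives the IT department a simple record of what each run loaded and rejected.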
A pilot project for an enterprise data warehouse will usually contain only a few gigabytes of data and involve one or two departments and two to four source systems.

It may require more coordination between departments, IT managers and company executives than a quickly installed, isolated data mart does, but these efforts will really pay off once the pilot data warehouse is up and running. Your company will be using this platform both for its current informational needs and with an eye to the future as your business and your requirements expand.





