I have been interviewing for more than half a year, and nearly every BI engineer position lists DW (data warehouse) experience as mandatory. So I kept asking why: why DW, and what is the relationship between BI and DW? The answers amount to little more than "BI lives on top of DW" and "DW is the data store that supports decision-making". Personally, though, I think DW, a product of the 1980s, is going downhill. An enterprise has to look at the ROI of a project: how much investment does a DW require, and what does it actually do for its users?
First, I want to correct a misconception: some people say DW is for senior management, but it is not. Enterprise managers generally fall into two groups: operational and strategic. Both need data to support decisions, but they have different requirements for, and attitudes toward, data.
· Operational: they need real-time, accurate, operational-level data. They look at reports every week or even every other day, and may even question a difference of 1,000 yuan;
· Strategic: they mostly need financial data and only need to know the general trend over one or two months or a quarter; 1 billion and 1.003 billion are not very different to them. They may not even need to read reports, since they learn the general situation from meetings and PPT decks.
Now look at DW. A traditional DW integrates data from the various operational systems (ERP, SCM) through ETL development. That data is operational data. First-line managers can use it to analyze which stores have high sales. But will VPs and CxO-level strategic managers be interested in a store analysis report submitted by the Personal Consumption Business Unit / China Head Office / Shanghai Branch / Marketing Department? So the role of DW is very limited.
Next, I divide enterprises into two categories: traditional enterprises, and emerging ones such as small and medium-sized enterprises and Internet companies.
Traditional enterprises have relatively mature business models, are large, adopted ERP systems early, and have a fairly well-developed IT ecosystem. For operational staff, the main platform they work with is ERP, and what they need is an integrated system. ERP provides the corresponding reports and simple multi-dimensional analysis for different positions and functions. Here BI is integrated into ERP, which is what we have been calling operational BI; Oracle, for example, positions BI as middleware. In that case, what reason do we have to invest more manpower and resources in DW?
Emerging and small and medium-sized enterprises are characterized by fast-changing business models, so the cost and risk of developing a DW may be several or even dozens of times those in traditional industries. For example, in the Internet industry, more than 10 new features may be added to the online system every week. Sometimes a feature launches today and the business wants its data through DW tomorrow; if it performs poorly, it is taken offline next week. ETL is tightly coupled to the source system, and what the DW team fears most is changes to the source database. Pushing metadata management and impact analysis across the entire company is unrealistic. A change that makes ETL fail outright is the easy case; the hard case is when the physical structure of the source database stays the same but the business definition of the data changes. So in many cases the DW team can only passively chase business users and source system developers, fixing ETL errors while being blamed for incorrect data.
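The coupling between ETL and the source schema can be made visible with a small pre-flight check. The following is a minimal sketch, not a full impact-analysis tool: it assumes a SQLite source purely for illustration, and the `EXPECTED_SCHEMA` contract, the `orders` table, and its column names are all hypothetical.

```python
import sqlite3

# Hypothetical contract: the columns (and declared types) the ETL job expects
# to find in each source table. Table and column names are illustrative only.
EXPECTED_SCHEMA = {
    "orders": {"order_id": "INTEGER", "store_id": "INTEGER", "amount": "REAL"},
}


def schema_drift(conn, expected):
    """Return readable differences between the expected and actual source schema."""
    problems = []
    for table, columns in expected.items():
        # PRAGMA table_info yields one row per column:
        # (cid, name, type, notnull, dflt_value, pk)
        rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
        actual = {row[1]: row[2].upper() for row in rows}
        if not actual:
            problems.append(f"{table}: table missing")
            continue
        for name, col_type in columns.items():
            if name not in actual:
                problems.append(f"{table}.{name}: column missing")
            elif actual[name] != col_type:
                problems.append(f"{table}.{name}: type {actual[name]} != {col_type}")
        for name in sorted(actual.keys() - columns.keys()):
            problems.append(f"{table}.{name}: unexpected new column")
    return problems


def demo():
    """Simulate a source team renaming `amount` without telling the DW team."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER, store_id INTEGER, amount_cents INTEGER)"
    )
    return schema_drift(conn, EXPECTED_SCHEMA)


if __name__ == "__main__":
    for line in demo():
        print(line)
```

A check like this only catches physical changes. It cannot detect the harder case described above, where the columns stay the same but the business meaning of the data changes.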
So the reasons for DW that I collected during the interviews mainly come down to three.
1. Large data volume. This is especially true in the Internet industry. Online systems have strict response-time requirements, so reporting workloads need to be separated to keep them from affecting performance. In this case you can directly use technologies such as GoldenGate to synchronize data to a standby database for queries and statistics, which can also avoid the impact of source database changes.
2. Multiple data sources and multiple systems. This is a legacy problem that small and medium-sized enterprises often run into as they grow, and it gets resolved gradually. In addition, some cases cannot be avoided, such as loading log files into the database and integrating data from third-party partners.
3. Data is duplicated into a dimensional model to make it easier to build OLAP. Today's OLAP modeling tools are powerful enough that explicit dimension and fact table definitions are not required in the relational data.
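The log-file case in point 2 is the kind of integration that remains necessary regardless of how the warehouse question is settled. Below is a minimal sketch of loading log lines into a queryable table. The log format (`<timestamp> <user_id> <action>`), the `page_events` table, and the function name are assumptions made for illustration, and SQLite stands in for whatever database is actually used.

```python
import re
import sqlite3

# Assumed (hypothetical) log format: "<ISO timestamp> <user_id> <action>"
LOG_LINE = re.compile(r"^(\S+) (\d+) (\w+)$")


def load_log_lines(conn, lines):
    """Parse log lines into page_events; return (loaded, rejected) counts."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page_events (ts TEXT, user_id INTEGER, action TEXT)"
    )
    rows = []
    rejected = 0
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            rejected += 1  # count bad lines instead of silently dropping them
            continue
        ts, user_id, action = match.groups()
        rows.append((ts, int(user_id), action))
    conn.executemany("INSERT INTO page_events VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows), rejected


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    sample = [
        "2024-01-01T10:00:00 42 click",
        "this line does not match the format",
        "2024-01-01T10:00:05 7 view",
    ]
    print(load_log_lines(conn, sample))
    print(conn.execute(
        "SELECT action, COUNT(*) FROM page_events GROUP BY action ORDER BY action"
    ).fetchall())
```

Returning the rejected count gives operators a number to check against the source, which speaks to the "unreliable data" complaints the DW team tends to receive.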
Of the three reasons above, only the second holds up. That is why the data warehouse is now being questioned. The boss always cares about ROI. He does not care how you do it; what matters is what output you get for so much investment. Since the data already exists, why build yet another database, and one that operators often complain holds unreliable data?