OLAP--The ODS project summary--key in BI

Source: Internet
Author: User

The project was completed years ago, and looking back there are many small issues worth revisiting. These notes are a little messy, so let's start from the requirements.

First, requirements differ from industry to industry and are hard to unify. In general, they fall into the following areas:

1. Time Window

ODS systems are commonly classified as Class I, Class II, Class III, and Class IV:

Class I ODS: data lags the application systems by 1-2 seconds, i.e. real time or near real time

Class II ODS: data lags the application systems by 2-4 hours

Class III ODS: data lags the application systems by 12-24 hours

Class IV ODS: part of the decision-analysis data in the data warehouse is fed back into the ODS
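As a rough illustration of how the time window drives the choice, the sketch below picks the slowest (and therefore usually cheapest) class whose worst-case delay still fits a requirement. The function name `pick_ods_class` and the use of each range's upper bound as a hard limit are assumptions made for this example; Class IV is left out because it is defined by the direction of data flow rather than by latency.

```python
from typing import Optional

# Worst-case data delay per class, in seconds, taken from the ranges listed above.
# Ordered from slowest to fastest so the first match is the least demanding class.
ODS_CLASSES = [
    ("Class III", 24 * 3600),
    ("Class II", 4 * 3600),
    ("Class I", 2),
]


def pick_ods_class(max_delay_seconds: float) -> Optional[str]:
    """Return the slowest class whose worst-case delay fits the requirement."""
    for name, worst_case in ODS_CLASSES:
        if worst_case <= max_delay_seconds:
            return name
    return None  # tighter than even Class I promises in this simplified model


if __name__ == "__main__":
    for delay in (24 * 3600, 5 * 3600, 60):
        print(delay, pick_ods_class(delay))
```

Note that a requirement of, say, one minute already forces Class I in this model, because Class II only guarantees delivery within hours.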

The closer to real time the data must be, the more powerful the hardware required and the higher the software cost. The technology choices also differ.

If the data must be synchronized in real time, you need a Class I ODS, which usually relies on EAI, message queuing, and messaging mechanisms. One step down, you can use advanced database features such as extracting changes from Oracle's redo log; many vendors now support this. The last resort is database triggers, which mean a very large workload of tedious, repetitive code with little reusability.
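To make the trigger-based fallback concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for the production database. The `account` table, the `account_changes` log table, and the trigger names are hypothetical. It also shows why the approach gets repetitive: every source table needs its own set of insert, update, and delete triggers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL);

-- Change log that a downstream ODS loader would poll (hypothetical layout).
CREATE TABLE account_changes (
    change_id INTEGER PRIMARY KEY AUTOINCREMENT,
    op TEXT, id INTEGER, balance REAL
);

-- One trigger per operation; the same boilerplate repeats for every table.
CREATE TRIGGER account_ins AFTER INSERT ON account BEGIN
    INSERT INTO account_changes (op, id, balance)
    VALUES ('I', NEW.id, NEW.balance);
END;
CREATE TRIGGER account_upd AFTER UPDATE ON account BEGIN
    INSERT INTO account_changes (op, id, balance)
    VALUES ('U', NEW.id, NEW.balance);
END;
CREATE TRIGGER account_del AFTER DELETE ON account BEGIN
    INSERT INTO account_changes (op, id, balance)
    VALUES ('D', OLD.id, OLD.balance);
END;
""")

# Simulate application activity on the source table.
conn.execute("INSERT INTO account VALUES (1, 100.0)")
conn.execute("UPDATE account SET balance = 80.0 WHERE id = 1")
conn.execute("DELETE FROM account WHERE id = 1")

# The change log now holds one row per operation, in order.
changes = conn.execute(
    "SELECT op, id, balance FROM account_changes ORDER BY change_id"
).fetchall()
print(changes)
```

Log-based capture avoids this per-table code because it reads changes from the database's own log instead of from hand-written triggers.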

Class II ODS seems to have been more common in the past, for example when bank transfers took a few hours to be credited to the account. Such systems are probably rarely built now, with a higher-performance Class III ODS used instead.

Class III ODS is very common. It is what people usually mean by ETL: batch data processing is a must-have here. There are many vendors, and they should be evaluated on ease of use, performance, and how well they integrate with the local database.

We used this architecture. The software came from Oracle and IBM, which are basically the major vendors.
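The batch pattern behind a Class III ODS can be sketched as an incremental extract-transform-load step driven by a watermark. This is only an illustration using in-memory `sqlite3` databases, not the Oracle/IBM tooling mentioned above; the `orders` and `ods_orders` tables, their columns, the sample rows, and the `run_batch` function are all assumptions.

```python
import sqlite3

source = sqlite3.connect(":memory:")  # stand-in for the application database
ods = sqlite3.connect(":memory:")     # stand-in for the ODS

source.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount_cents INTEGER, updated_at TEXT)"
)
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 1250, "2013-02-19 08:00:00"),
        (2, 9900, "2013-02-19 23:30:00"),
        (3, 400, "2013-02-20 09:15:00"),
    ],
)
ods.execute(
    "CREATE TABLE ods_orders (id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT)"
)


def run_batch(last_watermark: str) -> str:
    """Load rows changed after the watermark and return the new watermark."""
    # Extract: only rows modified since the previous batch.
    rows = source.execute(
        "SELECT id, amount_cents, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_watermark,),
    ).fetchall()
    # Transform: convert cents to currency units.
    transformed = [(rid, cents / 100.0, ts) for rid, cents, ts in rows]
    # Load: upsert so that re-running a batch does not duplicate rows.
    ods.executemany("INSERT OR REPLACE INTO ods_orders VALUES (?, ?, ?)", transformed)
    ods.commit()
    return rows[-1][2] if rows else last_watermark


watermark = run_batch("2013-02-19 12:00:00")
loaded = ods.execute("SELECT id, amount FROM ods_orders ORDER BY id").fetchall()
print(watermark, loaded)
```

Scheduling `run_batch` once or twice a day, and persisting the returned watermark between runs, gives the 12-24 hour delay that defines this class.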

Class IV ODS generally holds data aggregated from ODS data. People who do data analysis work with such systems, using tools such as SAS, SPSS, and R.

2. Data volume level

Any data becomes hard to handle once the volume grows. We tested data throughput at the gigabyte level, and a traditional database could barely cope. Beyond that level, it falls short in both ETL and data analysis.

At that point a big data architecture is needed, though not a pure one: a combination of big data and traditional databases. We are currently testing this approach. Much of the architecture has to change, and the worst part is that ETL becomes more complex while many traditional ETL tools have not kept up.

If the data volume exceeds the petabyte level, all the previous architectures have to be torn down in favor of a pure big data architecture, which is beyond what an ordinary company can do. We will not discuss that for now.

3. Data Attribute Validation

This takes up a large share of our ODS modeling work (which is similar to BI modeling).

Dimension data and fact table data (log data): these are the key guarantee that we do not drift away from the business.

Data sources (JMS, database, file, EAI): each involves different processing technologies.

Data processing (statistical, non-statistical): this is the key factor affecting ETL performance.
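One simple form of validation between dimension data and fact data is checking that every fact row refers to a dimension key that actually exists. The sketch below is a minimal illustration; the `dim_customer` and `fact_sales` sample data and the `find_orphans` helper are hypothetical and not taken from the project.

```python
from typing import Any, Dict, List

# Hypothetical dimension: customer key -> customer name.
dim_customer: Dict[int, str] = {101: "Acme", 102: "Globex"}

# Hypothetical fact rows; sale 2 points at a customer that does not exist.
fact_sales: List[Dict[str, Any]] = [
    {"sale_id": 1, "customer_id": 101, "amount": 50.0},
    {"sale_id": 2, "customer_id": 999, "amount": 20.0},
    {"sale_id": 3, "customer_id": 102, "amount": 75.0},
]


def find_orphans(
    facts: List[Dict[str, Any]], dimension: Dict[int, str], key: str
) -> List[Dict[str, Any]]:
    """Return fact rows whose foreign key has no matching dimension entry."""
    return [row for row in facts if row[key] not in dimension]


orphans = find_orphans(fact_sales, dim_customer, "customer_id")
print([row["sale_id"] for row in orphans])
```

Rows flagged this way can be rejected, parked for review, or mapped to an "unknown" dimension member, depending on what the business rules require.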

Ext.: http://www.cnblogs.com/jerryxing/archive/2013/02/20/2918130.html
