BI Dock-Data warehousing technology (Warehouse)

Source: Internet
Author: User

Before you start ranting about this topic, let's take a look at the official definition of a data warehouse:

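A data warehouse (Data Warehouse) is a subject-oriented (Subject Oriented), integrated (Integrated), relatively stable (Non-Volatile) collection of data that reflects historical change (Time Variant) and is used to support management decisions. That is the official definition.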

Take an "operational database", such as the database behind a bank's accounting system. Every business operation (say, you deposit 5 yuan) is recorded in it immediately, and over time it fills up with fragmented data. A database that does this kind of dirty, tiring work is called an "operational database"; it is oriented to business operations.

A "data warehouse", by contrast, serves decision support and analytical data processing, which sets it apart from the operational database. It also integrates multiple heterogeneous data sources, reorganizes the integrated data by subject, and keeps historical data. Data stored in a data warehouse is generally not modified afterwards.

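The relationship among an operational database, a data warehouse, and a database is like the relationship among the C: drive, the D: drive, and the hard disk. The database is the hard disk, the operational database is C:, and the data warehouse is D:. Both the operational database and the data warehouse are stored in a database; they differ only in how their tables are designed and what they are used for.

So why put a "data warehouse" layer between the operational database and BI? The first reason is that the operational database is busy day and night. Its main goal is to respond quickly to business operations, and it has no capacity to spare for BI's data requests. Those requests are usually aggregations, and a single select sum(xx) group by xx can make the operational database burn a large amount of resources. If business processing falls behind, the trouble is serious. Suppose you deposit 5000 yuan and ten minutes later the money still has not arrived. What would you think? That the bank's managers must be looking at a pie chart?

The second reason is that an enterprise usually runs several applications, each with its own operational database: a human resources database, a finance database, a sales document database, an inventory database, and so on. To provide a panoramic view of the data, BI has to bring these scattered data together. For example, to build an OLAP analysis that combines sales and inventory information, the BI tool must be able to fetch data from both databases efficiently. The most efficient approach is to integrate the data into the data warehouse first and have every BI application read from the warehouse.

Integrating data from scattered operational databases into a data warehouse is a substantial discipline in its own right, and it has given rise to a market for data integration software. The integration is not a matter of simply stacking tables together. You have to extract the dimensions of each operational database, define the dimensions they have in common as shared dimensions, consolidate the tables that hold the actual measures into a few large tables organized by subject (known as "fact tables", Fact Tables), build the warehouse table structure on this dimension-measure model, and then extract and transform the data. Later extractions usually run when the load on the operational database is light (in the early morning, for example) and pull only the new data incrementally, so that data accumulates in the warehouse.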
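The integration steps described above (shared dimensions, fact tables, incremental extraction) can be sketched in a few lines of Python using the standard sqlite3 module. This is only an illustration under stated assumptions, not how any particular data integration product works: the two operational databases, the table names (orders, stock, dim_product, fact_sales, fact_stock), and the "highest source id already loaded" watermark are all hypothetical choices made for the example.

```python
import sqlite3

# Two hypothetical operational databases, each owned by a different application.
sales_db = sqlite3.connect(":memory:")
sales_db.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, product TEXT, sold_on TEXT, amount REAL)")
sales_db.executemany(
    "INSERT INTO orders (product, sold_on, amount) VALUES (?, ?, ?)",
    [("pen", "2024-01-01", 5.0), ("ink", "2024-01-01", 12.0)])

stock_db = sqlite3.connect(":memory:")
stock_db.execute("CREATE TABLE stock (product TEXT PRIMARY KEY, quantity INTEGER)")
stock_db.executemany("INSERT INTO stock VALUES (?, ?)", [("pen", 40), ("ink", 7)])

# The warehouse: one shared dimension plus one fact table per subject.
dw = sqlite3.connect(":memory:")
dw.executescript("""
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE fact_sales (
    source_id   INTEGER PRIMARY KEY,  -- id of the row in the sales database
    product_key INTEGER REFERENCES dim_product(product_key),
    sold_on     TEXT,
    amount      REAL);
CREATE TABLE fact_stock (
    product_key INTEGER PRIMARY KEY REFERENCES dim_product(product_key),
    quantity    INTEGER);
""")

def product_key(name):
    """Return the shared dimension key for a product, creating it if needed."""
    dw.execute("INSERT OR IGNORE INTO dim_product (name) VALUES (?)", (name,))
    return dw.execute(
        "SELECT product_key FROM dim_product WHERE name = ?", (name,)).fetchone()[0]

def load_sales_increment():
    """Copy only the orders that are newer than anything already loaded."""
    last_id = dw.execute(
        "SELECT COALESCE(MAX(source_id), 0) FROM fact_sales").fetchone()[0]
    rows = sales_db.execute(
        "SELECT id, product, sold_on, amount FROM orders WHERE id > ?",
        (last_id,)).fetchall()
    for source_id, product, sold_on, amount in rows:
        dw.execute("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                   (source_id, product_key(product), sold_on, amount))
    dw.commit()
    return len(rows)

def refresh_stock():
    """Replace the stock snapshot with the current inventory levels."""
    for product, quantity in stock_db.execute(
            "SELECT product, quantity FROM stock").fetchall():
        dw.execute("INSERT OR REPLACE INTO fact_stock VALUES (?, ?)",
                   (product_key(product), quantity))
    dw.commit()

def sales_vs_stock():
    """Aggregate across both subjects using only the warehouse."""
    return dw.execute("""
        SELECT p.name, COALESCE(SUM(s.amount), 0), k.quantity
        FROM dim_product p
        LEFT JOIN fact_sales s ON s.product_key = p.product_key
        LEFT JOIN fact_stock k ON k.product_key = p.product_key
        GROUP BY p.name, k.quantity
        ORDER BY p.name
    """).fetchall()

# One simulated nightly run, followed by a BI-style query.
print("orders loaded:", load_sales_increment())
refresh_stock()
print(sales_vs_stock())
```

The aggregate query touches only the warehouse connection, so the sum and group-by never compete with the operational databases. Running load_sales_increment() again without new orders copies nothing, which is the point of incremental extraction.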

Most BI applications do not need real-time data. Decision-makers, for example, only need to see a weekly report once a week. About 95% of BI applications have no real-time requirement and can tolerate data that lags by anywhere from 1 hour to 1 month. This is characteristic of decision support systems, and the lag interval is the window in which the data extraction tool does its work. BI applications do often include a small amount of demand for real-time data. For those special cases only, the BI query software connects directly to the business database, but it must limit the load it places there and must not run complex queries.

Current database products include optimizations aimed specifically at data warehousing. For example, when you install a recent version of MySQL, the installer asks whether the database instance should be transaction-oriented or geared to decision support. The former is an operational database and the latter is a data warehouse (decision support, that term again). The database then applies targeted optimization for whichever form you choose.

From: http://www.powerbibbs.com/thread-131-1-1.html
