An overview of online analytical processing systems

Development background
With the wide adoption of database technology, enterprise information systems have accumulated large volumes of data. How to extract information from this data to support decision analysis is an important problem facing enterprise decision makers. Traditional enterprise database systems (management information systems, MIS) use online transaction processing (On-Line Transaction Processing, OLTP) as their means of data management. They are designed mainly for transaction processing, and their support for analytical processing is unsatisfactory. People therefore began to reprocess the data held in OLTP databases to build integrated, analysis-oriented decision support systems (Decision Support System, DSS) that better support decision making.

Enterprise information system data is generally managed by a DBMS. However, decision databases and operational databases differ in data source, data content, data schema, service audience, access pattern, transaction management, and physical storage, so it is not appropriate to build a DSS directly on the operational database. Data warehouse technology developed against this background. The concept of the data warehouse was proposed in the mid-1980s, and during the 1990s data warehousing moved from early exploration to practical use. W. H. Inmon, widely recognized in the industry as the founder of the data warehouse concept, defines it in his book Building the Data Warehouse: "A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management's decision-making process." Building a data warehouse means extracting data from the OLTP databases distributed across the enterprise according to a predesigned logical schema and, through the necessary transformations, producing data with a unified enterprise-wide schema. The core of today's data warehouses is still a database system managed by an RDBMS.

Because a data warehouse holds a large amount of data, the RDBMS generally takes measures to improve performance, such as parallel processing architectures, new ways of organizing data, query strategies, and indexing techniques.

Many applications, online analytical processing (On-Line Analytical Processing, OLAP) among them, drove the emergence and development of data warehouse technology, and data warehouse technology in turn promoted the development of OLAP. The concept of online analytical processing was first proposed in 1993 by E. F. Codd, the father of the relational database. Codd argued that online transaction processing (OLTP) could not meet end users' needs for query and analysis, and that simple SQL queries against large databases could not meet the needs of user analysis either. A user's decision analysis requires extensive computation over the relational database to produce results, whereas plain query results do not meet decision makers' needs. Codd therefore put forward the concepts of the multidimensional database and multidimensional analysis, that is, OLAP. The OLAP Council defines online analytical processing as a category of software technology that enables analysts, managers, and executives to gain insight into data through fast, consistent, interactive access to a wide variety of possible views of information that has been transformed from raw data to reflect the real dimensionality of the enterprise as understood by the user. The goal of OLAP is to meet the specific query and reporting needs of decision support in a multidimensional environment. Its technical core is the concept of the "dimension", so OLAP can also be described as a collection of multidimensional data analysis tools.
OLAP and OLTP
The data for OLAP, like the data for OLTP, comes from the underlying database systems, but the two differ, and so do the characteristics of their data. The differences are summarized in the following table:



OLTP data | OLAP data
Raw data | Derived data
Detailed data | Aggregated and refined data
Current-value data | Historical data
Updatable | Not updatable, but periodically refreshed
Small amount of data processed at a time | Large amount of data processed at a time
Application-oriented, transaction-driven | Analysis-oriented, analysis-driven
For operational staff; supports day-to-day operations | For decision makers; supports management needs
Characteristics and evaluation criteria of OLAP
The characteristics of OLAP can be summed up in five keywords: Fast Analysis of Shared Multidimensional Information (FASMI). These also serve as criteria by which designers or managers can judge whether an OLAP design is successful.

Fast: The system should respond to users quickly. To achieve this, the database schema should draw on a broader range of techniques, including specialized data storage formats, precomputation, and appropriate hardware configuration.

Analysis: The system should be able to handle any logical and statistical analysis relevant to the application. Users should be able to define new ad hoc calculations as part of their analysis without programming, and have results reported in the form they want. Users can analyze data on the OLAP platform itself or connect to external analysis tools, and the system should provide flexible, open report processing so that analysis results can be saved.

Shared: The system must meet the security requirements for data confidentiality. Even when multiple users are working at the same time, each should see only the information permitted by their security level.

Multidimensional: The salient feature of OLAP is that it provides a multidimensional view of data. The system must provide multidimensional views and multidimensional analysis of the data, including full support for hierarchies and multiple hierarchies.

Information: Regardless of how much data there is and where it is stored, an OLAP system should be able to obtain information promptly and manage large volumes of it. Many factors need to be considered here, such as the availability of data, available disk space, the performance of the OLAP product, and the degree of integration with the data warehouse.
OLAP logic concepts and typical operations
OLAP presents the user with a multidimensional view of the data.

Dimension: a particular angle from which people observe data. It is a class of attributes considered when examining a problem; a set of such attributes constitutes a dimension (for example, the time dimension or the geography dimension).

Dimension level: within a particular angle of observation (that is, a dimension), data can be described at different levels of detail (for the time dimension: date, month, quarter, year).

Dimension member: a value of a dimension, describing the position of a data item along that dimension (for example, "a certain day of a certain month of a certain year" describes a position on the time dimension).

Measure: the value held in a cell of the multidimensional array (for example: January 2000, Shanghai, notebook computer, $100,000).
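The four concepts above can be made concrete with a small Python sketch. It is only an illustration: the class name Dimension, the helper describe, and the level names chosen for the geography and product dimensions are assumptions, not part of any OLAP product. The single cell reuses the example from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    """A viewing angle on the data, with its levels ordered fine to coarse."""
    name: str
    levels: tuple


TIME = Dimension("Time", ("date", "month", "quarter", "year"))
GEOGRAPHY = Dimension("Geography", ("city",))   # level name is an assumption
PRODUCT = Dimension("Product", ("product",))    # level name is an assumption

DIMENSIONS = (TIME, GEOGRAPHY, PRODUCT)

# One member per dimension locates a cell; the cell holds the measure.
cube = {("2000-01", "Shanghai", "notebook computer"): 100000}


def describe(coords):
    """Pair each member with the dimension it belongs to."""
    return {dim.name: member for dim, member in zip(DIMENSIONS, coords)}


coords = ("2000-01", "Shanghai", "notebook computer")
print(describe(coords), "->", cube[coords])
```

Here the tuple of members plays the role of the cell's coordinates, and the dictionary value is the measure stored at that position.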







The basic multidimensional analysis operations of OLAP are drilling (roll-up and drill-down), slice and dice, and pivot (rotation).
Drilling: changes the level of a dimension and therefore the granularity of the analysis. It includes drilling down (drill-down) and drilling up (drill-up, also called roll-up). Roll-up generalizes low-level detail data along a dimension into higher-level summary data, or reduces the number of dimensions. Drill-down does the opposite: it moves from summary data down to detail data for closer observation, or adds new dimensions.
Slice and dice: select fixed values on some of the dimensions and examine how the measure is distributed over the remaining dimensions. If two dimensions remain, the result is a slice; if three or more remain, it is a dice.
Pivot (rotation): changes the orientation of the dimensions, that is, rearranges where the dimensions are placed in the table (for example, swapping rows and columns).
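The operations described above can be sketched on a tiny in-memory cube. This is a minimal illustration using plain Python dictionaries, not the way any OLAP server implements them; the function names roll_up, select, and pivot and all sales figures are made up for the example.

```python
from collections import defaultdict

# (month, city, product) -> sales; all figures are invented for illustration
facts = {
    ("2000-01", "Shanghai", "notebook"): 100,
    ("2000-02", "Shanghai", "notebook"): 120,
    ("2000-01", "Beijing", "notebook"): 80,
    ("2000-01", "Shanghai", "desktop"): 50,
}


def roll_up(cells, to_parent, axis):
    """Drill up: replace the member on one axis with its parent and sum."""
    out = defaultdict(int)
    for key, value in cells.items():
        new_key = key[:axis] + (to_parent(key[axis]),) + key[axis + 1:]
        out[new_key] += value
    return dict(out)


def select(cells, axis, member):
    """Fix one member on one dimension and keep the remaining dimensions."""
    return {k[:axis] + k[axis + 1:]: v
            for k, v in cells.items() if k[axis] == member}


def pivot(cells, order):
    """Rotate: rearrange the order of the dimensions in every key."""
    return {tuple(k[i] for i in order): v for k, v in cells.items()}


by_year = roll_up(facts, lambda month: month[:4], axis=0)   # month -> year
shanghai = select(facts, axis=1, member="Shanghai")         # two dimensions remain
rotated = pivot(shanghai, order=(1, 0))                     # (product, month)
```

Fixing the city on a three-dimensional cube leaves two dimensions, so by the definition in the text the result is a slice. Drill-down is the inverse of roll_up and needs the detail data to still be available, which is why facts is kept unchanged.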
Architecture and classification of OLAP systems
Data warehouses and OLAP complement each other. Modern OLAP systems are generally built on a data warehouse: a subset of the detailed data is extracted from the warehouse, aggregated as needed, and stored in the OLAP store, from which front-end analysis tools read it. The typical OLAP system architecture is shown in the following illustration:

According to the format in which the data is stored, OLAP systems fall into three categories: relational OLAP (Relational OLAP, ROLAP), multidimensional OLAP (Multidimensional OLAP, MOLAP), and hybrid OLAP (Hybrid OLAP, HOLAP).

1. ROLAP

ROLAP stores the multidimensional data used for analysis in a relational database and, according to the needs of the application, defines a set of materialized views that are also stored as tables in the relational database. Not every SQL query needs to be saved as a materialized view; only queries that are used frequently and are expensive to compute are defined as materialized views. For each query sent to the OLAP server, precomputed materialized views are used first to generate the result, which improves query efficiency. The RDBMS used as ROLAP storage is also optimized for OLAP, for example with parallel storage, parallel query, parallel data management, cost-based query optimization, bitmap indexes, and SQL OLAP extensions (CUBE, ROLLUP).
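The idea of answering queries from a precomputed view can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: SQLite has no native materialized views, so a summary table created with CREATE TABLE ... AS SELECT stands in for one, and the table names, column names, and figures are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (month TEXT, city TEXT, product TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('2000-01', 'Shanghai', 'notebook', 100),
        ('2000-02', 'Shanghai', 'notebook', 120),
        ('2000-01', 'Beijing',  'notebook', 80);
    -- Stand-in for a materialized view: a precomputed summary table.
    CREATE TABLE sales_by_city AS
        SELECT city, SUM(amount) AS amount FROM sales GROUP BY city;
""")


def total_for_city(city):
    """Answer from the precomputed summary rather than scanning detail rows."""
    row = conn.execute(
        "SELECT amount FROM sales_by_city WHERE city = ?", (city,)
    ).fetchone()
    return row[0] if row else 0


print(total_for_city("Shanghai"))
```

A real ROLAP server would also have to refresh the summary when the detail table changes and decide which summaries are worth keeping; both are omitted here.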

2. MOLAP

MOLAP physically stores the multidimensional data used in OLAP analysis as multidimensional arrays, forming a "cube" structure. The attribute values of each dimension are mapped to subscripts, or ranges of subscripts, of the array, and the summary data is stored in the array cells as the array's values. Because MOLAP uses a new storage structure implemented from the physical layer up, it is also called physical OLAP (Physical OLAP). ROLAP, by contrast, is implemented mainly through software tools or middleware while the physical layer still uses relational database storage, so it is called virtual OLAP (Virtual OLAP).
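The mapping from dimension members to array subscripts can be sketched as follows. This is a simplified illustration using nested Python lists; real MOLAP engines use specialized, often compressed, array storage. The member lists, the helper names store and lookup, and the figures are assumptions made for the example.

```python
months = ["2000-01", "2000-02"]
cities = ["Shanghai", "Beijing"]
products = ["notebook", "desktop"]

# Map each dimension member to its array subscript.
index = [{member: i for i, member in enumerate(dim)}
         for dim in (months, cities, products)]

# Dense 3-D array: every cell exists, even if it stays empty (0).
cube = [[[0 for _ in products] for _ in cities] for _ in months]


def store(month, city, product, value):
    """Write a measure into the cell addressed by three members."""
    cube[index[0][month]][index[1][city]][index[2][product]] = value


def lookup(month, city, product):
    """Read a measure by translating members to subscripts."""
    return cube[index[0][month]][index[1][city]][index[2][product]]


store("2000-01", "Shanghai", "notebook", 100)
print(lookup("2000-01", "Shanghai", "notebook"))
```

Lookup is a direct subscript access with no search, which is one reason array storage can respond quickly. The cost is that cells are allocated for every combination of members, whether or not data exists for it.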

3. HOLAP

MOLAP and ROLAP each have advantages and disadvantages (as shown in the table below), and their structures are very different, which makes designing an OLAP architecture a challenge for analysts. For this reason a new structure, hybrid OLAP (HOLAP), was proposed to combine the advantages of both. There is as yet no formal definition of HOLAP. Clearly, though, a HOLAP structure should not be a simple juxtaposition of MOLAP and ROLAP but an organic combination of the two that can satisfy users' various complex analysis requests.






ROLAP | MOLAP
Uses existing relational database technology | Designed specifically for OLAP
Slower response than MOLAP; existing relational databases have been heavily optimized for OLAP (parallel storage, parallel query, parallel data management, cost-based query optimization, bitmap indexes, SQL OLAP extensions such as CUBE and ROLLUP), which has improved performance | Good performance and fast response
Fast data loading | Slow data loading
Small storage footprint; no limit on the number of dimensions | Requires precomputation, which may cause data explosion; limited number of dimensions; cannot support dynamic changes to dimensions
Stores data in an RDBMS; no file size limit | Limited by the operating system's file size limit; hard to reach the terabyte level (only 10 to 20 GB)
Detail data and summary data can both be stored through SQL | Lacks standards for the data model and data access
Does not support read-write operations on estimates; SQL cannot perform some calculations (calculations across multiple rows, calculations between dimensions) | Supports high-performance decision support calculations (complex cross-dimensional calculations, multi-user read-write operations, row-level calculations)
Difficult to maintain | Easy to manage
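The "data explosion" risk listed for MOLAP in the comparison above follows from simple arithmetic, sketched below. The dimension sizes are hypothetical, and the formulas are the usual textbook counts for a dense cube with one level per dimension, not figures from any particular product.

```python
from math import prod


def dense_cells(cardinalities):
    """Cells in the base array: one per combination of members."""
    return prod(cardinalities)


def aggregate_views(n_dimensions):
    """Number of distinct group-by combinations over n dimensions."""
    return 2 ** n_dimensions


def full_cube_cells(cardinalities):
    """Cells when every dimension also gets an 'all' member for totals."""
    return prod(c + 1 for c in cardinalities)


sizes = [365, 100, 1000]   # hypothetical: days, cities, products
print(dense_cells(sizes), aggregate_views(len(sizes)), full_cube_cells(sizes))
```

With these sizes the base array already has 36,500,000 cells, and adding a fourth dimension with 50 members would multiply that by 50. This multiplicative growth is why precomputing every aggregate limits the number of dimensions a MOLAP cube can hold in practice.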


Introduction to products from major OLAP vendors
Hyperion

Hyperion Essbase OLAP Server has more than 100 applications built on it, and more than 300 developers use Essbase as a platform. It offers hundreds of calculation formulas, supporting procedural scripting, forecasting, and statistical and dimension-based calculations.

Powerful OLAP query capabilities: with Essbase Query Designer, business users can build complex queries without help from IT staff.

Broad application support: extends the value of data warehouses and ERP systems, and supports building analytical applications for e-commerce, CRM, finance, manufacturing, retail, and CPG (consumer packaged goods).

Speed-of-thought response time, with support for multiple users reading and writing at the same time

Web-enabled, server-centric architecture with support for SMP

Strong partners provide complete solutions: more than 60 packaged solutions and more than 300 consulting and implementation companies.

Rich front-end tools: more than 30 front-end tools to choose from, including Hyperion's own Wired for OLAP, Spider-Man Web Application, Objects, Essbase Spreadsheet Add-in, Web Gateway, and Reporting.

Hyperion Enterprise, a financial consolidation, reporting, and analysis solution for multinational corporations. More than 3,000 organizations use the system.

Feature-rich: supports a variety of financial standards, including US GAAP, Canadian GAAP, UK GAAP, International Accounting Standards (IAS), FASB, and HGB; automatic reconciliation of transactions between branch offices; FAS 52 currency conversion; FAS 94.

Easy to use: the system can be accessed through Excel, Lotus 1-2-3, and various browsers.

Supports adjustments to company structure.

Support for multinational corporations: supports 6 languages and the legal and tax requirements of different countries.

Complete process control and audit trail, with security level settings.

Can be integrated with ERP or other data sources

Hyperion Pillar, a budgeting and planning tool with more than 1,500 users worldwide. It provides activity-based budgeting, project-based planning, centralized planning, sales forecasting, and comprehensive planning.

Distributed architecture

Detailed planning: allows line managers to develop detailed plans

Sophisticated modeling and analysis capabilities

Oracle

Express Server offers comprehensive OLAP capabilities and has more than 3,000 users worldwide

Users can access it through the Web and spreadsheets

Flexible data organization: data can be stored in Express Server or accessed directly in the RDB

Built-in analytic functions and a 4GL let users customize queries

Cognos

PowerPlay provides a comprehensive reporting and analysis environment for business performance measurement (BPM), giving decision makers a range of key data on business performance.

Multidimensional data can be browsed using only mouse clicks and drags

Automatically publishes analysis reports to the Web

Supports a variety of OLAP servers: Microsoft OLAP Services, Hyperion Essbase, SAP BW, IBM OLAP for DB2

Complete authorization and security system

NovaView is a client application for Microsoft SQL Server 7.0 OLAP Services.

MicroStrategy

MicroStrategy 7 is a new-generation intelligence platform (Intelligence Platform) for e-business and electronic customer relationship management (eCRM) applications.

Strong analytical capabilities

A Web-centric interface

Supports millions of users and terabytes of data

Rapid development capabilities, working directly with existing data schemas

Intelligence Server: a single server for all analytic applications

Microsoft

SQL Server 7.0 OLAP Services, the OLAP module of SQL Server 7.0, can use any relational database or flat file as a data source. Its PivotTable Service provides client-side data caching and computation.

Intelligent client/server data management improves response speed and reduces network traffic

Allows access from different clients through OLE DB for OLAP

BusinessObjects

BusinessObjects is an easy-to-use BI tool that allows users to access, analyze, and share data.

Works with a variety of data sources: RDB, ERP, OLAP, Excel

Can be customized through VBA and an open object model

IBM

DB2 OLAP Server is a powerful multidimensional analysis tool that integrates the Hyperion Essbase OLAP engine with the DB2 relational database.

Fully compatible with the Essbase API

Data is stored in the DB2 relational database using a star schema

Brio

Brio.Enterprise is a powerful, easy-to-use BI tool that provides query, OLAP analysis, and reporting capabilities

Supports multiple languages, including Chinese

Brio.Report, a powerful enterprise-level reporting tool


OLAP-related standards
APB-1 OLAP Benchmark Release II (sponsored by the OLAP Council)

The benchmark simulates a realistic OLAP (On-Line Analytical Processing) business situation that exercises server-based software. The goal of the APB-1 is to measure a server's overall OLAP performance rather than the performance of specific tasks. To ensure the relevance of the APB-1 to actual business environments, the operations performed on the database have been carefully chosen to reflect common business operations. For the purposes of comparing the performance of different combinations of hardware and software, a standard benchmark metric called AQM (Analytical Queries per Minute), instead of AQT (Analytical Query Time), has been defined.

