What is OLAP and OLTP Source: http://www.itisedu.com/phrase/200604230827105.html Http://www.huihoo.com/database/dw2.html
Today's data processing can be roughly divided into two categories: online transaction processing OLTP (on-line transaction processing), Online Analytical Processing OLAP (on-line analytical
Processing ).OLTP is the main application of traditional relational databases, mainly for basic and daily transaction processing, such as bank transactions. OLAP is the main application of the data warehouse system. It supports complex analysis operations, focuses on decision support, and provides intuitive and easy-to-understand query results. The following table lists the comparison between OLTP and OLAP.
|
OLTP |
OLAP |
User |
Operator and lower-level management personnel |
Decision makers and senior management personnel |
Function |
Routine operations |
Analysis and decision-making |
DB Design |
Application-oriented |
Subject-oriented |
Data |
Latest, detailed, two-dimensional, discrete |
Historical, aggregated, multidimensional, and integrated |
Access Scale |
Read/writeSeveral (or even hundreds) records |
ReadMillions (or even hundreds of millions) records |
Operation Frequency |
Very frequent (in seconds) |
Relatively loose (in hours or even weeks) |
Work Unit |
Strict transactions |
Complex Query |
Number of users |
Hundreds to tens of millions |
Several-hundreds |
DB size |
100mb-gb |
100gb-tb |
OLAP)The concept was first proposed by E. F. codd, the father of relational database, in 1993. He also proposed 12 principles for OLAP. The proposal of OLAP has caused a great deal of response. As a kind of product, OLAP is used as a kind of parallel machine transaction processing.
(OLTP) is clearly differentiated.
OLAP enables analysts, managers, or executors to quickly, consistently, and interactively access information from multiple perspectives to gain a deeper understanding of data.Software technology. OLAP is designed to meet decision-making support or specific query and report requirements in multi-dimensional environments. Its core technology is the concept of "dimension.
"Dimension" is a high-level classification from the perspective of observing the objective world. Dimensions generally contain hierarchical relationships, which are sometimes quite complex. By defining multiple important attributes of an object into multiple dimensions, you can compare data in different dimensions. Therefore, OLAP is also a collection of multidimensional data analysis tools.
OLAP basic multidimensional analysis operations include drilling (roll up and drill down), slice, dice, rotation, drill SS, and drill through.
· Drilling is used to change the dimension hierarchy and the granularity of analysis. It includes roll up and drill down ). Roll up summarizes low-level detailed data to high-level summary data in one dimension, or reduces the dimension, it goes from summarized data to detailed data to observe or add new dimensions.
· Slice and slice are concerned with the distribution of measurement data on the remaining dimension after a set value is selected on some dimensions. If there are only two remaining dimensions, the slice is used; if there are three, the slice is used.
· Rotation is to change the direction of the dimension, that is, to reschedule the placement of the dimension in the table (for example, row-column Interchange ).
OLAP has multiple implementation methods. Different data storage methods can be divided into ROLAP, molap, and holap.
ROLAP indicates the relational database-based OLAP implementation (Relational OLAP ). With relational databases as the core, multidimensional data is represented and stored in a relational structure. ROLAP divides the multidimensional structure of a multi-dimensional database into two types of tables: fact tables used to store data and dimension keywords, and dimension tables, that is, at least one table is used for each dimension to store the description information of dimension levels, member categories, and other dimensions. A dimension table is associated with a fact table by the primary keyword and the external keyword to form a "star mode ". For complex hierarchical dimensions, to avoid occupying too much storage space for redundant data, you can use multiple tables to describe this star mode extension called "Snowflake mode ".
Molap indicates the implementation of OLAP based on multi-dimensional data organization ). Taking multi-dimensional data as the core, that is, molap uses multi-dimensional arrays to store data. Multi-dimensional data will form a "cube" structure in storage, in molap, "Rotate", "cut", and "slice" of "cube" are the main technologies used to generate multidimensional data reports.
Holap indicates the OLAP implementation (Hybrid OLAP) based on the hybrid data organization ). For example, the lower layer is relational and the higher layer is multi-dimensional matrix. This method provides better flexibility.
There are other ways to implement OLAP, such as providing a dedicated SQL Server and providing special support for SQL queries in some storage modes (such as star and snowflake.
OLAP is an online data access and analysis tool for specific problems. It analyzes, queries, and reports data in multiple dimensions. Dimension is a specific angle for people to observe data. For example, when considering the sales status of a product, an enterprise usually observes the sales status of the product from different perspectives of time, region, and product. The time, region, and product here are dimensions. Different combinations of these dimensions and multidimensional arrays composed of the measured indicators are the basis of OLAP analysis, which can be formally expressed as (Dimension 1, dimension 2 ,......, Dimensions N, metrics), such as (Region, time, product, sales ). Multi-dimensional analysis refers to the use of slice, dice, drill-down, and roll-up for multi-dimensional data) in order to analyze the data, users can observe the data in the database from multiple perspectives and aspects, so as to gain a deep understanding of the information contained in the data.
According to the different organization methods of comprehensive data, currently common OLAP mainly includes multi-dimensional database molap and relational database-based ROLAP. Molap organizes and stores data in multiple dimensions, while ROLAP uses the existing relational database technology to simulate multi-dimensional data. In data warehouse applications, OLAP applications are generally the front-end tools of Data Warehouse applications. At the same time, OLAP tools can be used together with data mining tools and statistical analysis tools to enhance the decision analysis function.
Online Transaction Processing (OLTP)A computer network is used to connect a business processing computer device or network distributed in different geographical locations to a business management center network, this allows you to perform unified and real-time business processing activities or customer services on any network node.
In DBMS, transactions are discrete units of work. For example, a database transaction can be used to modify a user's account balance or write operations on inventory items. The online transaction processing system (OLTP) collects and processes transaction-linked data in real time and changes in the status of shared databases and other files. In online transaction processing, the transaction is executed immediately, which is opposite to batch processing. a batch of transactions are stored for a period of time and then executed. Most batch processing (such as account exchange) is performed at night. The OLTP results can be obtained immediately in this database. Here we assume that these transactions can be completed. Online transaction processing occurs in real time. The civil aviation ticketing system and bank ATM are examples of online transaction processing systems.
It is simple to execute transactions in a single user or database environment because there are no conflicts or requirements for inter-Database Synchronization. In a distributed environment, maintaining the integrity of multiple databases is another problem. Traditionally, most online transaction processing systems are implemented on large-scale computer systems because of the complexity of their operations and the need for fast input/output, prohibition, and management. If a transaction must be modified on multiple sites, a management mechanism is required to prevent data rewriting and synchronization. Other requirements include the ability to roll back failed transactions, security features, and the ability to recover data as needed. This is handled by a Transaction Processing Supervisor. This supervisor ensures that the transaction is completely completed or rolled back, so that the database status is correct.
In a distributed environment, write operations often occur concurrently on multiple database servers. This kind of concurrent transaction processing requires a "Roll Back" mechanism to ensure database integrity even if the system fails during a write operation. The transaction is either confirmed together or abandoned. If one or more transaction-related system responses are inconsistent, this means that the system or communication may fail and thus a transaction will be abandoned.
It can be seen that when multiple users try to change the same data block at the same time, there is a conflict. In addition, write operations on multiple databases must be synchronized, And the write operations must be completed by all databases. A monitoring program is required to ensure data integrity. There are four requirements for transaction processing in a distributed environment, called acid ".
Atomicity: defines an independent unit of work. If a transaction is distributed, all sub-transactions that affect data in the separated locations must be executed together as a transaction. To maintain data consistency across multiple locations, we need to use the two-phase approval process described below.
Consistency: consistency is basically a requirement for a database to change from one State to the same State. The transaction supervisor must verify that all affected data is consistent.
Isolation: the transaction must be executed in isolation until it is completed and not affected by other transactions during execution.
Durability: This nature is carried out together with the final confirmation of the transaction. Once a transaction is tested to be correct for all affected systems, it is recognized and does not need to be rolled back.
Online transaction processing is widely used in finance, securities, futures, information services, and other systems.
For example, the Banking Business Network of the financial system can cover the whole country and even the whole world through dial-up lines, leased lines, group exchange networks, and satellite communication networks, so as to realize a wide range of savings business exchanges, funds are cleared and transferred nationwide at any branch or Subbranch.
On the ATM network, you can use a credit card to obtain withdrawals, deposits, transfers, and other services from any ATM.
On the futures and securities trading network, all member companies throughout the country can make quotations, transactions, delivery, settlement, and information queries on their local computers. In addition, the air ticket booking system is also a typical online transaction processing system that provides air ticket booking and ticketing services nationwide and even globally.
From: http://www.huihoo.com/database/dw2.html
Http://www.huihoo.com/database/dw2.html
From: http://blog.sina.com.cn/s/blog_3e902bd3010006l9.html