OLTP (online transaction processing) and OLAP (Online Analytical Processing) concepts and optimization, oltpolap
OLTP
, Also called online transaction processing (Online Transaction Processing
), Indicating that the transaction is very high. Generally, it is a high-availability online system. It focuses on small transactions and small queries. When evaluating its system, it generally depends on what it executes per second.Transaction
AndExecute SQL
. In such a system, a single database ProcessesTransaction
Usually more than several hundred, or thousands,Select
The number of statements executed is several thousand or even tens of thousands per second. TypicalOLTP
Systems include e-commerce systems, banks, and securities, such as the United States.eBay
Is a typical business database.OLTP
Database.
OLTP
The most likely cause of system bottlenecks isCPU
And the disk subsystem.
(1)CPU
When a bottleneck occurs, the total number of logical reads is equal to the number of logical reads of a single statement multiplied by the number of executions. If the execution speed of a single statement is fast, however, a large number of executions may also lead to a large amount of logical reads. The design method and optimization method are to reduce the logical reads of a single statement or reduce the number of execution times. In addition, some compute functions, such as user-defined functions,decode
And so on, it will consume a lotCPU
Time, causing the system load to rise, the correct design method or optimization method, you need to avoid the calculation process as much as possible, such as saving the calculation results to the statistical table is a good method.
(2) The disk subsystem isOLTP
In the environment, its carrying capacity generally depends on itsIOPS
Processing capability.OLTP
In the environment, Disk Physical reads are generallydb file sequential read
, That is, a single read, but this read frequency is very frequent. If the disk subsystem is unable to carry itsIOPS
It will cause high performance problems.
OLTP
The commonly used design and optimization methods are:Cache
Technology andB-tree
Indexing Technology,Cache
It is determined that many statements do not need to obtain data from the disk subsystem. Therefore,Web cache
AndOracle data buffer
PairOLTP
The system is very important. In addition, in terms of index usage, the simpler the Statement, the better. In this way, the execution plan is stable, and variable binding must be used to reduce statement parsing, table joining, and distributed transactions, partition technology,MV
Technology, parallel technology, and bitmap indexing. Because of the high concurrency, batch update must be submitted in batches quickly to avoid blocking.
OLTP
The system is a data block that changes frequently,SQL
Systems that frequently submit statements. Data blocks should be stored in the memory as much as possible.SQL
For example, try to use the variable binding technology to achieveSQL
Reuse to reduce physicalI/O
And duplicateSQL
To greatly improve the database performance.
In addition to binding variables, the impact on performance may also be hot and fast (hot block
). When a block is read by multiple users at the same time,Oracle
To maintain data consistency, you must useLatch
To serialize user operations. When a user obtainslatch
Other users can only wait. The more users get the data block, the more obvious they wait. This is the hot issue. This kind of hot speed may be data blocks or rollback blocks. Data blocks are usually caused by uneven data distribution in the database. If the data blocks are indexed, you can create a reverse index to re-distribute the data, for data blocks in a rollback segment, you can add more rollback segments to avoid such contention.
OLAP
, Also called online analysis processing (Online Analytical Processing
) System, sometimes calledDSS
The decision support system is what we call data warehouses. In such a system, the statement execution volume is not an assessment standard,Because the execution time of a statement may be very long, and the amount of data read is also very large.. Therefore, in such a system, the assessment criteria are usuallyThroughput of the Disk Subsystem(Bandwidth), such as how much can be reachedMB/s
.
The throughput of the Disk Subsystem usually depends on the number of disks. At this time,Cache
Basically, there is no effect, and the read/write type of the database is basicallydb file scattered read
Anddirect path read/write
. Use a large number of disks and a large bandwidth, such4Gb
.
InOLAP
In the system, partition and parallel technologies are often used.
Partition Technology inOLAP
The importance of the system is mainly reflected in database management. For example, Database loading can be achieved through partition exchange. Backup can be achieved through backup of Partition Table space, and data deletion can be performed through partitions, as for the performance impact of partitions, it can quickly scan some large tables (only scan a single partition ). In addition, if partitions are combined with parallelism, the entire table can be quickly scanned. In short, the main function of a partition is ease of management. It cannot definitely improve the query performance. Sometimes the partition will improve the performance and sometimes decrease.
In addition to combining with the partitioning technologyOracle 10g
, AndRAC
Combined with the implementation of multi-node scanning at the same time, the effect is also very good, you can put a task, suchselect
Full table scan, evenly distributed to multipleRAC
.
InOLAP
In the system, no binding is required (BIND
) Variable, because the execution volume of the entire system is small, the analysis time can be ignored for the execution time, and the error execution plan can be avoided. HoweverOLAP
Bitmap indexes and materialized views can be used in a large amount. For large transactions, we try to seek Speed Optimization. There is no need to request fast commit like OLTP, or even deliberately slow down the execution speed.
The real purpose of variable binding isOLTP
In the system, this system usually has this characteristic, the user concurrency is large, the user requests are very intensive, andSQL
Most of them can be reused.
ForOLAP
For the system, most of the time the database runs a report job, the execution is basically an aggregation classSQL
Operation, suchgroup by
In this case, set the optimizer modeall_rows
Is appropriate.
For some website-type databases with many paging operations, set itfirst_rows
It will be better. But sometimesOLAP
System, we can considerSQL
China sitehint
. For example:
Select a.* from table a;
Separate Design and Optimization
Pay special attention to the design, such as the high availabilityOLTP
In the environment, do not blindlyOLAP
Technology.
For example, the partition technology assumes that the partition keyword is not used in a wide range, and other fields are usedwhere
If it is a local index, You have to scan Multiple indexes, and the performance becomes lower. If it is a global index, the meaning of the partition is lost.
This is also true of parallel technology, which is generally used when a large task is completed. For example, if you translate a book in real life, you can arrange multiple people to translate different chapters for each person, this improves the translation speed. If you only translate one page of books and assign different people to translate different rows and combine them, it is unnecessary because during the time of assignment, A person may have already translated the text.
The same is true for Bitmap indexes.OLTP
In the environment, blocking and deadlock are easily caused. HoweverOLAP
In the environment, it may improveOLAP
.MV
It is also basically the same, including triggers and so on, inDML
FrequentOLTP
In the system, it is easy to become a bottleneck, or evenLibrary Cache
Wait whileOLAP
In the environment, the query speed may be improved due to proper use.
ForOLAP
System, there is little room for optimization in the memory, increaseCPU
Processing speed and diskI/O
Speed is the most direct way to improve database performance. Of course, this also means an increase in system costs.
For example, if we want to aggregate hundreds of millions or billions of data records, it is very difficult to store all such massive data in the memory, and there is no need at the same time, because the data is rarely reused quickly, it does not make practical sense to cache it, and it also causes physicalI/O
Very large. Therefore, the bottleneck of this system is often the diskI/O
Above.
ForOLAP
System,SQL
The optimization is very important because it has a large amount of data. The performance difference between full table scanning and indexing is very large.