Oracle Performance Myths--mts,rac, partitioning, parallel queries
To improve performance, we've done a lot of testing on the methods or scenarios that the Oracle database itself provides
Mainly include:
Shared server mode (MTS)
Cluster technology (clustering), RAC
Partition
Parallel processing (mainly parallel queries)
These features provided by Oracle are indeed used for performance improvements, but we tend to overlook the analysis of our own application features and whether they are appropriate for us.
Recently, through a deep understanding of this knowledge, we found that we had some misconceptions before. I think it is necessary for us all to change this misunderstanding.
Before we analyze, let's be clear about our application features.
Database applications can be roughly divided into OLAP and OLTP two categories, namely: Online Transaction analysis (data warehousing) and online transaction processing (transaction applications)
Our application system has the characteristics of online transaction processing and a small amount of data warehouse.
1. Shared Server (MTS)
Oracle defaults to the dedicated server mode, which means that a user connection process corresponds to a server process.
I remember when a big hospital just opened, we have tried MTS. Because I heard that MTS is connecting more clients without adding memory and CPU, the result is not what we expected.
Is there a problem with MTS? No, because we don't know about MTS, it's not that it has a problem, but it's not used to do it in this situation.
In general, MTS is recommended only if the number of concurrent connections exceeds the support of the operating system, otherwise the default dedicated server mode should be used.
That is, in dedicated server mode, because more than one connection consumes more than one operating system process, MTS is only necessary when concurrent application requirements exceed the allowable number of connections to the operating system.
If you have an existing system that physically supports a dedicated server database with 100 connections, instead of using Shared server mode, you may support 1000 connections, but there may be only 100 active connections.
Generally 2 to 4 CPU servers, 200 to 400 concurrent connections is sufficient, if the connection increased, you can increase the CPU and memory.
MTS has some of the following drawbacks:
1. The code path for a shared server is longer than a dedicated server, so it is inherently slower than a dedicated server.
2. There is the possibility of an artificial deadlock, because it is serial, all shared servers are bound together (a process), as long as one connection is blocked, all users are blocked and most likely to deadlock.
3. There is a possibility of an exclusive transaction, because if a session is running for too long, it exclusively shares resources and other users can only wait. (and dedicated servers, each client is a session)
4. The shared server model restricts certain database features, such as:
You cannot start and close the instance individually, you cannot perform media recovery, you cannot use log Miner, you cannot use it, and sql_trace is meaningless (because it is shared rather than the current session).
The reduced memory of MTS is actually the memory required for each user to connect to the operating system process in the dedicated server mode, but it uses the large_pool of the SGA to allocate UGA, robbing Peter, and reducing the memory is very small.
If a user session is connected and disconnected very frequently, the cost of creating and deleting the database process is very high, preferably in a shared server mode (otherwise, connection pooling technology should be used).
Fortunately, the design of our product may have taken this factor into account, using a lifetime of connectivity (within the session lifecycle) to avoid this situation.
So, to sum up, for our products, we recommend the default dedicated server mode, when the connection is not enough, by adding hardware to resolve, rather than using MTS instead.
In addition, Oracle can in fact support both shared and dedicated server mode, specifying one session to use a dedicated server and another session using a shared server.
2. Cluster technology (RAC)
Oracle RAC (real application clusters), we say that dual-computer fault tolerance is a kind of RAC.
The advantage of clustering technology is that it extends performance horizontally and provides high availability.
32-bit operating systems have 4G of memory limitations, and some UNIX systems (and non-advanced versions of Windows) have a limited number of CPUs.
and cluster technology through the collection of multiple machines work together, horizontal break this limit.
Through RAC, a single server instance, multiple machines form an instance service set, and the client connects to it.
This technology, we sometimes say to the customer is load balanced, actually this is one-sided, RAC is mainly for CPU and memory load balancing,
Does not implement load balancing for disk IO. (Of course, disk IO can be implemented via RAID or NAS)
Another benefit of RAC is that it improves usability, which means a server is broken (note: Not a data storage medium) and does not affect normal use.
Like load balancing, it improves availability above the data tier, but not all, because the data is broken and there is no way to do it.
(data layer, that's the Oracle data Guard, or simply say it's storage hardware.)
However, the benefits of RAC also bring about performance impact.
Because it wants to globally coordinate the data cache and ensure that the cached data that is connected to each instance is consistent, the following three contradictions are magnified:
1. Cache contention
2. Excessive I/O
3. Lock
That is, if there is a problem with the RAC, the problem will be much greater, for example, if the SQL does not use binding variables to cause cache contention, it is more serious to use RAC.
All in all, if your server's CPU is plugged in and your memory is up to the limit, and concurrent users are growing, or you're asking for a very high downtime, RAC is definitely your choice.
3. Zoning
The purpose of Oracle partitioning is to divide large tables or indexes into small fragments for easier management.
We may have mistakenly thought that zoning is fast=true, it can improve speed, and we have done experiments in oncology and pediatrics.
In fact, in a transactional system, partitioning generally does not speed up query speed (in some cases, it may reduce contention for shared resources).
Oracle partitioning features, mainly for the data warehouse to design, that is, if you have a table of 100G size, preferably using partitions, the advantages of the following three aspects:
1. Improve Usability
Partitioning is the principle of divide and conquer, if a table is divided into multiple partitions, one of the partitions in the media out of the problem, does not affect the entire table of other partitioned data access.
2. Easy to manage
In the Data Warehouse, the table is divided into small pieces, easier to bulk Delete, defragment, and some parallel processing.
3. Improve performance
In this regard, it is most difficult to achieve by partitioning, and it must be carefully calculated to arrange the partition data.
Zoning planning is complex, take our product application, the general query involves multiple tables, multiple indexes,
Suppose we set up a division for the patient's expense record, the drug delivery record, and the patient's doctor's order record.
Obviously, the range partitioning is not very useful for us to improve performance, the hash partitioning is our query requirements, but most of the data hashing is not concentrated.
Plus, there are so many indexes on these tables, common IDs, time class indexes,
Few people can do all of them globally or accurately range partitioning (in fact, it may not be possible to have multiple indexed range partitions on demand).
If the query often involves multiple indexes, how to ensure that every index used is on a partition, if not, must scan multiple partitions, increase logical I/O and CPU time, thus increasing the query time.
(data is distributed across different physical storage media, discussed in parallel processing below)
Again, some situations may reduce contention for shared resources, meaning that parallel modifications and updates are faster.
Careful analysis, what is the principle of our zoning? The most common possibility is to partition the range by time, so that most of the modifications and updates are made on the same partition,
So the contention for reducing shared resources is basically ineffective.
(Is there a unique application requirement for a hash partition by Department ID?) is there a unique application requirement based on a list partition (typical eigenvalue)? basically no.
Partitioning mainly improves performance from a parallel perspective, but the application of the transaction processing system itself determines that it is not suitable for this technique.
In other words, there is no need to adopt zoning technology for the characteristics of transaction processing of our products.
4. Parallel processing
According to our application characteristics, the main analysis of parallel query. General requirements with partition characteristics, multiple CPU hardware.
Since Oracle 8.1.6, an option has been added to automatically tune parallel queries: parallel_automatic_tuning=true
By setting the parallel parameter on the corresponding table, Oracle automatically parallelization the actions on the table when appropriate.
Parallel queries are essentially useless for transaction processing systems.
Because the design of the parallel query is for the single user in the Data warehouse completely consumes 100 of the resources to do.
In transaction processing, there are often many concurrent users, they compete with shared resources, so you find a way to let a user occupy all the resources is counterproductive.
These are my learning understandings of some of the common solutions to Oracle performance, and I hope to help you get a good understanding of these topics.