System Bottleneck Analysis
First, system bottleneck analysis examples
Example 1:
If a transaction's response time is long and far exceeds the system performance requirements, this points to CPU-intensive database operations such as sorting or aggregate functions (for example, SUM, MIN, MAX, COUNT). Check whether indexes exist and whether they are reasonable, use simple table joins where possible, split large tables horizontally, and so on, to reduce the response time.
Example 2:
Segmented elimination. Test tools can simulate virtual users that access the web server, the application server, and the database server separately. Subtracting the time measured for each of these segments from the response time measured at the web tier shows where the bottleneck is, and tuning can start there.
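The subtraction in Example 2 can be sketched as follows. This is a minimal illustration, assuming each tier is measured under the same load and that the web-tier time includes the application-server time, which in turn includes the database time. The function names and the sample figures are hypothetical.

```python
def tier_breakdown(web_s, app_s, db_s):
    """Split an end-to-end response time into per-tier contributions.

    Assumes nested measurements: web_s includes app_s, and app_s includes db_s.
    """
    return {
        "web": web_s - app_s,  # time spent only in the web server
        "app": app_s - db_s,   # time spent only in the application server
        "db": db_s,            # time spent in the database server
    }


def slowest_tier(breakdown):
    """Return the name of the tier that contributes the most time."""
    return max(breakdown, key=breakdown.get)


# Hypothetical measurements in seconds
parts = tier_breakdown(web_s=4.0, app_s=3.5, db_s=3.0)
print(parts, slowest_tier(parts))
```

With these sample figures the database tier dominates, so tuning would start there.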
Example 3:
In UNIX resource monitoring (the same applies to the NT operating system), watch the memory paging rate (Paging). If the value is occasionally high, threads are competing for memory at that moment. If it stays high, memory may be a bottleneck; it may also mean that the memory access hit ratio is low. The swap-in and swap-out rates are interpreted similarly.
Example 4:
In UNIX resource monitoring (the same applies to the NT operating system), watch CPU utilization. If the value stays above 95%, the CPU is a bottleneck, and you can consider adding a processor or switching to a faster one. A reasonable utilization range is 60% to 70%.
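The "stays above 95%" rule can be checked against a series of utilization samples. The sketch below is only an illustration: the text does not say how long "continuously" is, so the `min_consecutive` parameter and the sample values are assumptions.

```python
def is_cpu_bottleneck(samples, threshold=95.0, min_consecutive=5):
    """Return True if utilization exceeds threshold for min_consecutive samples in a row.

    min_consecutive is an assumed stand-in for "continuously"; tune it to the
    sampling interval actually used.
    """
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0  # reset on any sample at or below threshold
        if run >= min_consecutive:
            return True
    return False


# Hypothetical samples, in percent
print(is_cpu_bottleneck([96, 97, 98, 99, 96, 97]))  # sustained load
print(is_cpu_bottleneck([96, 50, 97, 98, 99, 96]))  # a brief spike, then a short run
```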
Example 5:
In UNIX resource monitoring (the same applies to the NT operating system), watch the disk I/O rate. If the value stays high, I/O is a problem. Consider switching to a faster disk system, redeploying business logic, and so on, and setting tempdb in RAM to reduce max async IO as well as max lazy writer IO.
Example 6:
In Tuxedo resource monitoring, watch the number of bytes in the queue (Bytes on queue). The queue length should not exceed 1.5 to 2 times the number of disks. To improve performance, you can add disks. Note: a RAID disk array actually contains more than one disk.
Example 7:
In SQL Server resource monitoring, watch the cache hit ratio (Cache Hit Ratio); the higher the value, the better. If it stays below 80%, consider adding memory. Note that this value is cumulative since SQL Server was started, so after the system has been running for some time it no longer reflects the current state of the system.
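Because the counter is cumulative, one way to see recent behavior is to take two snapshots and compute the ratio over the interval between them. The sketch below assumes that hit and lookup counts can be sampled as plain numbers; the tuple layout and the sample figures are hypothetical and do not correspond to any particular SQL Server interface.

```python
def interval_hit_ratio(prev, curr):
    """Hit ratio for the period between two snapshots of cumulative counters.

    prev and curr are (hits, lookups) tuples sampled at two points in time.
    """
    hits = curr[0] - prev[0]
    lookups = curr[1] - prev[1]
    if lookups <= 0:
        return None  # no activity, or the counters were reset between snapshots
    return hits / lookups


earlier = (9_000, 10_000)  # cumulative ratio is 90%
later = (9_600, 11_000)    # cumulative ratio is still about 87%
print(interval_hit_ratio(earlier, later))  # ratio for the recent interval only
```

In this made-up case the cumulative figure still looks healthy, while the recent interval has dropped well below the 80% guideline.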
Second, system tuning settings
1. CPU issues:
Consider replacing the current CPU with a faster one.
For multi-CPU systems, consider how the load is distributed across the CPUs.
Consider offloading work to other systems, for example by adding a front-end machine or setting up a parallel server.
2. Memory and cache:
Memory optimization includes operating system, database, and application memory optimization.
Excessive paging and swapping can degrade the performance of the system.
Memory allocation is also a major factor that affects system performance.
Ensure that the reserved list contains large contiguous blocks of memory.
Sizing the block buffer (expressed as a number of blocks) is an important adjustment.
Keep the most frequently used data in memory.
3. Disk (I/O) resource issues
Disk read and write speed is critical to a database system, and distributing database objects sensibly across physical devices can improve performance.
Disk mirroring slows down disk writes.
The performance of the system can be improved by distributing logs and database objects on separate devices.
Placing different databases on different hard disks can improve read and write speeds. Always put the database, the rollback segments, and the logs on different devices.
Put a table on one hard disk and its non-clustered indexes on another, so that physical reads and writes are faster.
4. Adjust configuration parameters
This includes operating system and database parameter configuration.
Parameters that limit resources for concurrent operation (number of concurrent users, number of sessions).
Parameters that affect resource overhead.
Parameters related to I/O.
5. Optimize the application system's network settings
You can reduce network calls by using an array interface. Instead of extracting one row at a time, it is more efficient to fetch 10 rows in a single round trip.
Adjust the buffer size of the session data unit (SDU).
With many concurrent connections, shared server processes provide better performance than dedicated server processes.
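The array-interface point above can be illustrated with Python's standard DB-API, where `cursor.arraysize` and `fetchmany()` return rows in batches. This is only a sketch: the in-memory `sqlite3` database stands in for a networked database server, and the `orders` table and its contents are made up for the example. With a real client/server driver, each batch would typically correspond to one network round trip.

```python
import sqlite3

# In-memory database used purely as a stand-in for a remote server
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders (amount) VALUES (?)",
    [(float(i),) for i in range(25)],
)


def fetch_in_batches(connection, batch_size=10):
    """Fetch all rows with fetchmany; return the rows and the number of fetch calls."""
    cur = connection.cursor()
    cur.arraysize = batch_size  # default batch size used by fetchmany()
    cur.execute("SELECT id, amount FROM orders ORDER BY id")
    rows, calls = [], 0
    while True:
        batch = cur.fetchmany()
        calls += 1
        if not batch:  # an empty list signals the end of the result set
            break
        rows.extend(batch)
    return rows, calls


rows, calls = fetch_in_batches(conn)
print(len(rows), calls)
```

Fetching the same 25 rows one at a time with `fetchone()` would take 26 calls (25 rows plus the final empty result) instead of 4.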
Third, database server performance problems and their causes
1. The response time of a single transaction type is too long
Causes include database server load, poor data design, transaction granularity, and the impact of batch tasks on performance for ordinary users.
2. Poor concurrent processing capability
3. Serious lock conflicts
Database transaction timeouts and database deadlocks caused by resource locking.
Fourth, database-related topics
1. General approach to database performance problems
Monitor performance-related data.
Locate the transactions that consume the most resources and make the necessary optimizations or adjustments.
Locate lock conflicts and modify the application logic where lock conflicts are severe.
Distribute large data sets, or data involved in lock conflicts that general optimization cannot resolve.
2. Oracle features related to improved performance
Indexing, parallel execution, clusters and hash clusters, partitioning, the multi-threaded server, and reading multiple data blocks at the same time.
3. Key Oracle configuration parameters
MAX_DISPATCHERS: This parameter specifies the maximum number of dispatcher processes that the system allows to run simultaneously.
MAX_SHARED_SERVERS: This parameter specifies the maximum number of shared server processes that the system allows to run simultaneously. If artificial deadlocks occur too frequently in the system, the administrator should increase the value of this parameter.
PARALLEL_ADAPTIVE_MULTI_USER: When this parameter is set to true, the system enables an adaptive algorithm that improves the performance of multi-user systems using parallel execution. The algorithm automatically reduces the requested degree of parallelism based on the system load at the start of the query.
PARALLEL_MIN_SERVERS: This parameter specifies the minimum number of parallel execution processes for an instance. The value is the number of parallel execution processes that Oracle creates when the instance starts.
PARALLEL_THREADS_PER_CPU: This parameter specifies the default degree of parallelism for the instance and influences the parallel adaptive and load balancing algorithms. It describes the number of parallel execution processes or threads that one CPU can handle during parallel execution.
PARTITION_VIEW_ENABLED: This parameter specifies whether the optimizer uses partition views. Oracle recommends that users use partitioned tables (introduced with Oracle8) rather than partition views. Partition views are supported only for backward compatibility.
RECOVERY_PARALLELISM: This parameter specifies the number of processes to use when recovering the database system.
4. Oracle parallel execution
The vast majority of RDBMS operations fall into the following 3 categories:
CPU-bound operations: Run serially, this type of operation is limited by the speed of a single CPU. With parallelization, multiple CPUs share the load, so the operation completes faster.
I/O-bound operations: This type of operation spends most of its time waiting for I/O to complete. Most RAID controllers perform well when several I/O requests are outstanding at the same time. In addition, while one thread waits for an I/O operation, the CPU can be used to process the CPU portion of another thread.
Contention-bound operations: Parallel processing does not improve operations that are limited by resource contention.
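The I/O-bound case can be demonstrated in miniature. The sketch below is not Oracle code: it uses `time.sleep` as a stand-in for waiting on a disk request and a thread pool as a stand-in for parallel execution servers, with arbitrary task counts and delays.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def simulated_io(_):
    """Pretend to wait for one I/O request to complete."""
    time.sleep(0.05)
    return 1


def run_sequential(n):
    """Issue n simulated I/O requests one after another; return elapsed seconds."""
    start = time.perf_counter()
    for i in range(n):
        simulated_io(i)
    return time.perf_counter() - start


def run_parallel(n, workers):
    """Issue n simulated I/O requests from a pool of worker threads; return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(simulated_io, range(n)))
    return time.perf_counter() - start


seq = run_sequential(8)
par = run_parallel(8, workers=4)
print(f"sequential {seq:.2f}s, parallel {par:.2f}s")
```

Because the workers spend their time waiting rather than computing, the waits overlap and the parallel run finishes sooner. A CPU-bound task would not gain in the same way from threads sharing one CPU.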
5. Factors to consider when choosing the degree of parallelism:
CPU capacity of the computer: The number and capacity of the CPUs affect the number of query processes that the system can run.
The ability of the system to handle a large number of processes: Some operating systems can handle many concurrent threads, while others cannot.
System load: If the system is already running at its limit, raising the degree of parallelism will have little effect. If the system is running at 90% of its capacity, too many query processes will overwhelm it.
Mix of queries processed by the system: If most of the system's work consists of updates but a small number of important queries remain, the developer may want the system to run multiple query processes.
System I/O capability: If the data is spread across several disks or stored on a disk array, the system can handle multiple parallel queries.
Operation type: Consider whether the system needs to handle many full table scans or sorts. Parallel query servers are very helpful for such operations.
6. Some suggestions on the degree of parallelism:
Operations that require large amounts of CPU resources, such as sorting, should use a lower degree of parallelism, because such CPU-bound operations already make full use of the CPU without waiting for I/O.
Operations that require large amounts of disk I/O, such as full table scans, should use a higher degree of parallelism. The more an operation has to wait for disk I/O, the more the system benefits from parallel execution.
If the system already runs a large number of concurrent processes, use a lower degree of parallelism, because too many processes will overwhelm the system.
7. Oracle reads multiple data blocks at the same time
When the system performs a table scan, Oracle can read multiple data blocks at once, which improves the system's I/O speed. By reading several blocks in one request, Oracle reads larger chunks of data from disk and avoids repeated disk seeks. Fewer seeks and larger reads reduce both the I/O overhead and the CPU overhead of the system.
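The effect on the number of read requests is simple arithmetic. The sketch below assumes a full scan that reads every block exactly once; the function name, the table size, and the batch size of 16 are made-up values for illustration. In Oracle the number of blocks per read during a scan is governed by the DB_FILE_MULTIBLOCK_READ_COUNT parameter.

```python
def read_requests(table_blocks, blocks_per_read):
    """Number of read requests needed to scan table_blocks, reading blocks_per_read at a time."""
    if blocks_per_read < 1:
        raise ValueError("blocks_per_read must be at least 1")
    # Integer ceiling division: a final partial batch still costs one request
    return (table_blocks + blocks_per_read - 1) // blocks_per_read


print(read_requests(10_000, 1))   # one block per request
print(read_requests(10_000, 16))  # sixteen blocks per request
```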
8. Oracle partitioning
Partitioning schemes:
Range partitioning: This scheme partitions the data in a table by ranges of values, such as month or year.
List partitioning: This scheme is similar to range partitioning, but it partitions by discrete data values rather than by ranges.
Hash partitioning: This scheme uses a hash function to partition the data automatically.
Sub-partitioning: This is the composite partitioning approach familiar to developers. It allows several partitioning schemes to be combined.
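How a row is routed to a partition under the first three schemes can be sketched in a few lines. This is a conceptual illustration, not Oracle's implementation: the partition names, the quarter boundaries, the country mapping, and the use of CRC32 as the hash function are all assumptions made for the example.

```python
import bisect
import zlib

# Hypothetical range partitions by month; each bound is the exclusive upper limit
RANGE_BOUNDS = [4, 7, 10, 13]
RANGE_NAMES = ["q1", "q2", "q3", "q4"]

# Hypothetical list partitions keyed by discrete country codes
LIST_MAP = {"CN": "asia", "JP": "asia", "DE": "europe", "FR": "europe"}


def range_partition(month):
    """Pick the partition whose range contains the month (1 to 12)."""
    if not 1 <= month <= 12:
        raise ValueError("month out of range")
    return RANGE_NAMES[bisect.bisect_right(RANGE_BOUNDS, month)]


def list_partition(country):
    """Pick the partition that explicitly lists this value."""
    return LIST_MAP[country]


def hash_partition(key, n_partitions=4):
    """Pick a partition number from a hash of the key.

    zlib.crc32 is used because it gives the same result on every run,
    unlike the built-in hash() for strings.
    """
    return zlib.crc32(key.encode("utf-8")) % n_partitions


print(range_partition(3), range_partition(4), list_partition("DE"))
print(hash_partition("order-42"))
```

Sub-partitioning would combine two of these steps, for example choosing a range partition by month and then a hash sub-partition by key within it.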
Partitioning has the following benefits:
Partitioning can reduce I/O operations and CPU usage when scanning large tables that have been partitioned.
Data can be loaded at the partition level rather than at the table level.
You can remove data by dropping partitions instead of using the DELETE statement to delete large amounts of data.
Partitions are completely transparent to users and applications.
Data can be maintained at the partition level rather than at the table level.
9. Oracle multi-threaded server
Users can connect to an Oracle instance through a dedicated server process or through a multi-threaded server process. Because each dedicated server process consumes a large amount of memory and other system resources, multi-threaded server processes should be used when many users connect.
The multi-threaded server allows multiple users to share a fixed number of shared server processes. The shared server processes use a shared buffer pool to queue user requests and return data, which greatly reduces CPU and memory usage.
10. Oracle fault diagnosis
Database diagnostics collects performance data on the system's SQL statements, such as the average execution time of each SQL statement in the Oracle database, to identify where a problem occurs.
To locate a fault, the diagnostic data (Oracle Diagnostics) is correlated with the transaction response time data (Transaction Response Time). For example, if the average response time of the transaction "enter" is high, Oracle Diagnostics can be used to find out what caused the problem.