Latch is a lightweight serial locking mechanism used to protect the Oracle shared memory structure and ensure consistency of concurrent user operations. For example, various data in SGA is repeatedly read from the disk to the memory, it is re-written back to the disk. If a concurrent user does the same thing, Oracle must use a mechanism to ensure that data can only be read by one session, this is latch. Latch will not cause blocking, but will only wait. There is a clear process associated with each latch. When the process holding latch becomes a dead process, the system cleanup process will be called. The system lock will cause users to wait. You need to consider whether there is a problem with the system's logic design, such as the deletion or modification of the primary key by multiple users, and whether users use select... For update syntax, whether the foreign key creates an index. For latch contention, system and database design issues, such as binding variables, hot blocks, and parameter settings, must be considered properly.
Spin: for example, if a block in the data cache is to be read, we will obtain the latch of this block. This process is called spin. Another process just needs to modify this block, he also needs the spin block. At this time, he must wait. The current process can only stay in the spin after latch is released, and then modify it. If multiple processes request at the same time, there will be competition between them, there is no queuing mechanism,
Once the previous process has been released, the subsequent process will be congested, and there is no concept of first-in-first-out, and all this will happen very fast, because latch features fast and short-lived.
Spin and sleep:
Sleep means temporarily giving up the CPU and performing context switch. In this way, the CPU needs to save some status information during the running of the current process, such as the stack, semaphore, and other data structures, then introduce the status information of subsequent processes, and switch back to the original Process status after processing. If this process frequently occurs in a high transaction and high-concurrency process processing system, it will be a very expensive resource consumption, so Oracle chooses spin, let the process continue to occupy the CPU, run some empty commands, then continue the request, continue the spin until the value reaches _ spin_count, at this time, the CPU will be discarded for a short period of sleep, and then continue the previous action.
In Oracle, latch is a lightweight lock. Generally, latch consists of three memory elements: PID (process ID), memory address, and memory length. Latch ensures exclusive access to the shared data structure, so as to ensure the integrity of the memory structure is not damaged. When multiple sessions modify or view the same memory structure in SGA at the same time, sequential access is required to ensure the integrity of the data structure in SGA.
The process can obtain latch in two modes: willing-to-wait and no_wait. The no-Wait mode is only used in a few latch instances. The statistical records for obtaining latch in no-Wait mode are stored in the immediate_gets immediate_misses column. These columns exist in the V $ latch, V $ latch_parent, and V $ latch_children views. Generally, the no-Wait mode is used to obtain latch with many sublatch for the first time, such as redo copy. If a process fails to obtain any of these sublatch for the first time, it immediately uses no-Wait mode to ask for the next one. The willing-to-Wait mode is used only when the no-Wait mode fails to be used to obtain all sublatch.
The willing-to-Wait mode is used to obtain latch statistics and store them in the gets and misses columns. Gets increases whenever a process uses the willing-to-Wait mode to obtain a latch.
If the latch is available when the process requests latch for the first time, the latch is obtained directly. Before modifying any protected data structure, the process writes some recovery information to the latch recovery zone. In this way, when the process that obtains latch encounters an exception, the pmon process can clear latch held by the process.
If the latch is unavailable when a request is made, the process waits for a short period of time (spin) in the CPU and then requests the latch again. If latch remains unavailable, the process (spin for a period of time and then request again) will be repeated. The number of repetitions is determined by the implicit parameter _ spin_count. The default value is 2000. If latch is obtained within _ spin_count, add one to each of the spin_gets and misses columns. Otherwise, process v $ session_wait records the latch free wait event and then releases the CPU, go to sleep status. After sleep for a certain period of time, the process is awakened and repeat the above process until latch is obtained. After the latch is obtained successfully, the sleep column is updated.
Because the process stops requests to latch only after obtaining latch, what if an exception occurs to a process holding latch and other processes requesting the latch? Isn't it necessary to keep waiting? No. When a process fails to request latch for a certain number of times, it will request the pmon process to view the owner of the latch. If the holding process is abnormal, pmon will clear the request and release the latch.
Each latch has a priority number ranging from 0 to 13. The priority numbers of the parent latch and independent latch are fixed in the Oracle kernel code. The sublatch is created by the worker when the instance is started. Its priority number is inherited from its parent latch. Priority can be used to avoid deadlocks.
When a process requests latch in no-Wait mode, the priority number of the latch must be the same as the priority number of the latch it currently holds.
When a process requests latch in willing-to-Wait mode, the priority number of the latch must be greater than the priority number of the latch it currently holds.
The process of obtaining latch:
At any time, only one process can access a data block in the memory. If the process cannot obtain latch because another process is occupying a block, it will perform a spin (rotation) on the CPU ), the time is very short. After spin is obtained, the spin is still obtained. If the spin is unsuccessful, the spin is still obtained until the number of spin times reaches the threshold (this is specified by the implicit parameter _ spin_count). At this time, the spin is stopped, for short-term sleep, after sleep, the previous action will continue until the latch on the block is obtained. There is also an algorithm for the process sleep time. It will increase with the number of spins, in the unit of per second. The sleep threshold value is controlled by the implicit parameter _ max_exponential_sleep. The default value is 2 seconds, if the current process has occupied another latch, its sleep time will not be too long (too long will cause latch wait for other processes). The maximum sleep time is determined by the implicit parameter _ max_sleep_holding_latch, the default value is 4 S. The sleep of this time limit is also called short-term waiting. Another case is that the latch wait posting wait for a long time. When the process fails to wait for latch to request, the system enters sleep and pushes a signal to the latch wait list, obtain latch requests. When the occupying process releases latch, it checks the latch wait list and sends a signal to the requesting process to activate the sleeping process. Latch wait list is a list of processes maintained in the SGA area. It also needs latch to ensure its normal operation. By default, the share pool latch and library cache latch adopt this mechanism.
If the implicit parameter _ latch_wait_posting is set to 2, all latch uses this waiting method, which can accurately wake up a waiting process, however, maintaining the latch wait list requires system resources, and the competition for latch on the latch wait list may also lead to bottlenecks.
Data Buffer Pool latch contention
Frequently accessed data blocks are called Hot blocks. When many users access several data blocks together, some latch contention occurs. The most common latch contention includes:
(1) buffer busy waits
(2) cache buffer chain
Cache buffer Chian cause:
When a session needs to access a memory block, it first needs to search for the block in the memory in a structure like a linked list, when a session accesses this linked list, it needs to obtain a latch. If the query fails, latch cache buffer chain waits, the reason for this wait is that there are too many sessions accessing the same data block or the list is too long (if too much data is read in the memory, the hash list for managing the data block will be very long, in this way, the time for the session scan list will increase, the time for holding the chache buffer chain latch will become longer, and the chance for other sessions to obtain this latch will decrease, and the waiting time will increase ).
Cause of buffer busy WAITS:
When a session needs to access a data block, which is being read by another user from the disk to the memory or is being modified by another session, the current session needs to wait, and a buffer busy waits will be generated.
The direct cause of these latch contention is that too many sessions access the same data block, resulting in hot and fast issues, the cause of the heat may be that the database settings or repeated SQL statements frequently access the same data block.
View latch buffer cache chains wait events
Select distinct A. Owner, A. segment_name
From dba_extents,
(Select dbarfil, dbablk
From x $ BH
Where hladdr in (select ADDR
From (select ADDR
From v $ latch_children
Order by sleeps DESC)
Where rownum <20) B
Where a. relative_fno = B. dbarfil
And a. block_id <= B. dbablk
And a. block_id + A. Blocks> B. dbablk;
Queries the busiest buffer of the current database. Tch (touch) indicates that the higher the number of accesses, the faster the hot spot competition will occur.
Select *
From (select ADDR,
TS #,
File #,
Dbarfil,
Dbablk,
Tch
From x $ BH
Order by tch DESC)
Where rownum <11;
Queries the busiest buffer in the current database. Combined with dba_extents queries, obtains the objects from which these hot buffers come from.
Select E. Owner, E. segment_name, E. segment_type
From dba_extents e,
(Select *
From (select ADDR, TS #, file #, dbarfil, dbablk, TCH
From x $ BH
Order by tch DESC)
Where rownum <11) B
Where E. relative_fno = B. dbarfil
And E. block_id <= B. dbablk
And E. block_id + E. Blocks> B. dbablk;
If the latch free hotspot block event is found in top 5, you can query the specific sublatch information from V $ latch_children.
Select *
From (select ADDR, child #, gets, misses, sleeps, immediate_gets igets,
Immediate_misses imiss, spin_gets sgets
From v $ latch_children
Where name = 'cache buffers chains'
Order by sleeps DESC)
Where rownum <11;
Obtain latch and buffer information of the most popular data blocks currently held.
Select B. ADDR, A. Ts #, A. dbarfil, A. dbablk, A. Tch, B. Gets, B. Misses, B. Sleeps
From (select *
From (select ADDR, TS #, file #, dbarfil, dbablk, tch, hladdr
From x $ BH
Order by tch DESC)
Where rownum <11),
(Select ADDR, gets, misses, sleeps
From v $ latch_children
Where name = 'cache Buffers' chains ') B
Where a. hladdr = B. ADDR;
The preceding SQL statements can be used to find the object information of these hotspot buffer.
Select distinct E. Owner, E. segment_name, E. segment_type
From dba_extents e,
(Select *
From (select ADDR, TS #, file #, dbarfil, dbablk, TCH
From x $ BH
Order by tch DESC)
Where rownum <11) B
Where E. relative_fno = B. dbarfil
And E. block_id <= B. dbablk
And E. block_id + E. Blocks> B. dbablk;
In combination with the SQL view, you can find the related SQL statements that operate on these objects, and then reduce data access by optimizing SQL statements, or optimize some competing operations (such as connect) to reduce hotspot block competition
Break on hash_value skip 1
Select/* + rule */hash_value, SQL _text
From v $ sqltext
Where (hash_value, address) in (
Select a. hash_value, A. Address
From v $ sqltext,
(Select distinct A. Owner, A. segment_name, A. segment_type
From dba_extents,
(Select dbarfil, dbablk
From (select dbarfil, dbablk
From x $ BH
Order by tch DESC)
Where rownum <11) B
Where a. relative_fno = B. dbarfil
And a. block_id <= B. dbablk
And a. block_id + A. Blocks> B. dbablk) B
Where a. SQL _text like '%' | B. segment_name | '%'
And B. segment_type = 'table ')
Order by hash_value, address, piece;