1. Latch description
1.1 Latch
Latch is a serial lock mechanism used to protect the shared data structure in the SGA zone. The implementation of latch is related to the operating system, especially whether a process needs to wait for a latch and how long it will take.
Latch is a lightweight lock resource provided by Oracle. It is a lock that can be acquired and released extremely quickly and can lock resources in a short time, prevent multiple concurrent processes from simultaneously modifying and accessing a shared resource. It only works in SGA and is usually used to protect the data structure of blocks in the buffer cache.
For example, in SGA, various types of data are repeatedly read from the disk to the memory and re-written back to the disk. If concurrent users do the same thing, Oracle must use a mechanism, to ensure that data can only be read by one session. This protection mechanism is latch.
Concurrency: more than two users modify the same data (including insertion, deletion, and modification ).
Parallel: divides a task into many small parts, allows each part to be executed simultaneously, and finally summarizes the execution results into the final results.
There is also a cleanup process associated with each latch. When the process holding the latch becomes a dead process, the cleanup process will be called. Latch also has a correlation level to prevent deadlocks. Once a process gets a latch at a certain level, it cannot obtain latch equal to or below this level.
Latch will not cause blocking, but will only cause waiting. Blocking is a system design issue, while waiting is a system resource contention issue.
1.2 SPin description:
For example, if a block in the data cache is to be read, we will obtain the latch of this block. This process is called spin. Another process happens to need to modify this block, and it also needs to spin this block, at this time, he must wait until the current process releases latch before spin can survive, and then modify it. If multiple processes request at the same time, there will be competition between them, and there is no queuing mechanism, once the previous process has been released, the subsequent process will be congested, and there is no concept of first-in-first-out, and all this will happen very fast, because Latch features fast and short-lived.
SPIN and sleep:
Sleep means temporarily giving up the CPU and performing context switch. In this way, the CPU needs to save some status information during the running of the current process, such as the stack, semaphore, and other data structures, then introduce the status information of subsequent processes, and switch back to the original Process status after processing. If this process frequently occurs in a high transaction and high-concurrency process processing system, it will be a very expensive resource consumption, so Oracle chooses spin, let the process continue to occupy the CPU, run some empty commands, then continue the request, continue the spin until the value reaches _ spin_count, at this time, the CPU will be discarded for a short period of sleep, and then continue the previous action.
1.3 process for obtaining Latch:
At any time, only one process can access a data block in the memory. If the process cannot obtain Latch because another process is occupying a block, it will perform a spin (rotation) on the CPU ), the time is very short. After spin is obtained, the spin will continue to be obtained. If it fails, the spin will still be obtained until the number of spin times reaches the threshold (this is specified by the implicit parameter _ spin_count). At this time, the spin will be stopped, for short-term sleep, after sleep, the previous action will continue until the Latch on the block is obtained.
There is also an algorithm for the process sleep time. It will increase with the number of spin times, in the unit of seconds, such ,... The sleep threshold is controlled by the implicit parameter _ max_exponential_sleep. The default value is 2 seconds. If the current process has occupied another Latch, the sleep duration is not too long (too long will cause Latch wait for other processes). The maximum sleep duration is determined by the implicit parameter _ max_sleep_holding_latch. The default value is 4 seconds. The sleep of this time limit is also called short-term wait.
Another case is that the Latch Wait Posting Wait for a long time. When the process fails to Wait for Latch to request, the system enters sleep and pushes a signal to the Latch Wait List, obtain Latch requests. When the occupying process releases Latch, it checks the Latch Wait List and sends a signal to the requesting process to activate the sleeping process. Latch Wait List is a List of processes maintained in the SGA area. It also needs Latch to ensure its normal operation. By default, share pool latch and library cache
Latch uses this mechanism.
If the implicit parameter _ latch_wait_posting is set to 2, all Latch uses this waiting method, which can accurately wake up a waiting process, however, maintaining the Latch Wait List requires system resources, and the competition for Latch on the Latch Wait List may also lead to bottlenecks.
If it takes a long time for a process to request, rotate, and sleep Latch, it will notify the PMON process to see if the process occupied by Latch has ended or died unexpectedly, if yes, PMON will clear the Latch resources occupied.
In short, the Latch acquisition process: Request-SPIN-sleep-request-SPIN-sleep ...... occupies.
1.4 Latch and Lock
In a sense, Latch is the resource Lock in memory, and the Lock of database objects (tables, indexes, etc.) is called Lock.
Latch and Lock:
(1 ). latch is a mechanism that provides mutex access to the memory data structure, and Lock is used to set shared resource objects in different modes. Each mode is compatible or exclusive. From this point, we can see that, latch access, including queries, is also mutually exclusive. At any time, only one process can pin a part of the memory. Fortunately, this process is quite short, otherwise, the system performance will not be guaranteed. Starting from 9I, multiple processes are allowed to query the same memory block at the same time.
(2) Latch only acts on the memory. It can only be accessed by the current instance, and Lock acts on the database object. In the RAC system, Lock detection and access are allowed between instances.
(3) Latch is instantly occupied. Release and Lock must wait until the transaction ends correctly. The length of time it takes is determined by the transaction size.
(4). latch is not in the queue, while lock is in the queue.
(5). Latch does not have a deadlock, but lock does.
Ii. Latch contention
If you find that the system is often waiting due to lock, you need to consider whether the system has any problems in the logic design, such as the deletion or modification of the primary key by multiple users, whether the user uses the select... For update syntax, whether the foreign key creates an index. These factors need to be combined with the business logic of the system to design database objects.
If it is found that the system is slow because of many latch contention, it is necessary to consider whether there is a problem in the system and database design, such as whether to use the bound variable, whether there is a hot and fast, whether the data storage parameter design is reasonable or not.
There are many reasons for latch contention and waiting, and many resources in the memory may compete for use. The two most common latch competitions are as follows:
(1) latch contention in the sharing pool.
(2) latch contention in the data buffer pool.
2.1 latch contention in the sharing pool
If a large number of SQL statements are repeatedly analyzed in the shared pool, it will lead to great Latch contention and long wait. The most common phenomenon is that no variables are bound.
The most common Latch in a centralized shared pool is the library cache. You can use SQL to query:
SQL> select * from v $ latchname where name like 'library cache % ';
LATCH # NAME HASH
----------------------------------------------------------------------
217 library cache 3055961779
218 library cache lock 916468430
219 library cache pin 2802704141
220 library cache pin allocation 4107073322
221 library cache lock allocation 3971284477
222 library cache load lock 2952162927
223 library cache hash chains 1130479025
When analyzing system performance, if we see Latch contention like library cache, we can conclude that there is a problem in the shared pool. This problem is basically caused by SQL statements, for example, no variables are bound or some stored procedures are analyzed repeatedly.
You can view the resource contention using the following SQL:
SQL> select event, count (*) from v $ session_wait group by event;
Event count (*)
--------------------------------------------------------------------------
SQL * Ne tmessage from client 4
Streams AQ: waiting for messages in the queue 1
ASM background timer 1
Gcs remote message 1
Ges remote message 1
Jobq slave wait 1
Rdbms ipc message 14
Smon timer 1
Pmon timer 1
Streams AQ: qmn slave idle wait 1
Class slave wait 1
SQL * Net message to client 1
Streams AQ: Waiting for time management or cleanup tasks 1
Streams AQ: qmn Coordinator idle wait 1
DIAG idle wait 1
15 rows selected.
2.2 Data Buffer Pool Latch contention
Frequently accessed data blocks are called Hot blocks. When many users access several data blocks together, some Latch contention occurs. The most common latch contention includes:
(1) buffer busy waits
(2) cache buffer chain
The competition between the two latchs takes place at different times of accessing data blocks.
Cache buffer Chian cause:
When a session needs to access a memory block, it first needs to search for the block in the memory in a structure like a linked list, when a session accesses this linked list, it needs to obtain a latch. If the query fails, latch cache buffer chain waits, the reason for this wait is that there are too many sessions accessing the same data block or the list is too long (if too much data is read in the memory, the hash list for managing the data block will be very long, in this way, the time for the session scan list will increase, the time for holding the chache buffer chain latch will become longer, and the chance for other sessions to obtain this latch will decrease, and the waiting time will increase ).
Cause of buffer busy WAITS:
When a session needs to access a data block, which is being read by another user from the disk to the memory or is being modified by another session, the current session needs to wait, and a buffer busy waits will be generated.
The direct cause of these latch contention is that too many sessions access the same data block, resulting in hot and fast issues, the cause of the heat may be that the database settings or repeated SQL statements frequently access the same data block.
Latch is a simple, low-level serialization technology used to protect the shared data structure in SGA, such as the concurrent user list and blocks information in the buffer cache. A server process or background process must obtain the corresponding latch before starting operations or looking for a shared data structure, and release latch after completion. It is not necessary to optimize latch itself. If latch is competitive, it indicates that part of the SGA is experiencing abnormal resource usage.
Iii. Check Latch-related SQL
3.1 view the latch buffer cache chains wait event hot and fast
Select distinct a. owner, a. segment_name
FROM dba_extents,
(SELECT dbarfil, dbablk
FROM x $ bh
WHERE hladdr IN (SELECT addr
FROM (SELECT addr
FROM v $ latch_children
Order by sleeps DESC)
Where rownum <20) B
WHERE a. RELATIVE_FNO = B. dbarfil
AND a. BLOCK_ID <= B. dbablk
AND a. block_id + a. blocks> B. dbablk;
3.2 query the busiest Buffer of the current database. TCH (Touch) indicates that the higher the number of visits, the faster the hot spot competition will occur.
SELECT *
FROM (SELECT addr,
Ts #,
File #,
Dbarfil,
Dbablk,
Tch
FROM x $ bh
Order by tch DESC)
Where rownum <11;
3.3 query the busiest Buffer in the current database and use dba_extents to query the objects from which these hot buffers come from.
SELECT e. owner, e. segment_name, e. segment_type
FROM dba_extents e,
(SELECT *
FROM (SELECT addr, ts #, file #, dbarfil, dbablk, tch
FROM x $ bh
Order by tch DESC)
Where rownum <11) B
WHERE e. relative_fno = B. dbarfil
AND e. block_id <= B. dbablk
AND e. block_id + e. blocks> B. dbablk;
3.4 If latch free hotspot block events are found in Top 5, you can query the specific sublatch information from V $ latch_children.
SELECT *
FROM (SELECT addr, child #, gets, misses, sleeps, immediate_gets igets,
Immediate_misses imiss, spin_gets sgets
FROM v $ latch_children
Where name = 'cache buffers chains'
Order by sleeps DESC)
Where rownum <11;
3.5 obtain Latch and buffer information of the most popular data blocks currently held
SELECT B. addr, a. ts #, a. dbarfil, a. dbablk, a. tch, B. gets, B. misses, B. sleeps
FROM (SELECT *
FROM (SELECT addr, ts #, file #, dbarfil, dbablk, tch, hladdr
FROM x $ bh
Order by tch DESC)
Where rownum <11),
(SELECT addr, gets, misses, sleeps
FROM v $ latch_children
Where name = 'cache Buffers' chains ') B
WHERE a. hladdr = B. addr;
3.6 using the preceding SQL statement, you can find the object information of these hotspot Buffer.
SELECT distinct e. owner, e. segment_name, e. segment_type
FROM dba_extents e,
(SELECT *
FROM (SELECT addr, ts #, file #, dbarfil, dbablk, tch
FROM x $ bh
Order by tch DESC)
Where rownum <11) B
WHERE e. relative_fno = B. dbarfil
AND e. block_id <= B. dbablk
AND e. block_id + e. blocks> B. dbablk;
3.7 in combination with the SQL view, you can find the related SQL statements that operate on these objects, and then reduce data access by optimizing SQL statements, or optimize some competing operations (such as connect) to reduce hotspot block competition
Break on hash_value skip 1
SELECT/* + Rule */Hash_value, SQL _text
FROM v $ sqltext
WHERE (hash_value, address) IN (
SELECT a. hash_value, a. address
FROM v $ sqltext,
(Select distinct a. owner, a. segment_name, a. segment_type
FROM dba_extents,
(Select dbarfil, dbablk
From (select dbarfil, dbablk
From x $ BH
Order by tch DESC)
Where rownum <11) B
WHERE a. relative_fno = B. dbarfil
AND a. block_id <= B. dbablk
AND a. block_id + a. blocks> B. dbablk) B
WHERE a. SQL _text LIKE '%' | B. segment_name | '%'
AND B. segment_type = 'table ')
Order by hash_value, address, piece;