Deep understanding of latch in Oracle
Serialization Overview
Serialization: a database system is a multi-user, concurrent processing system. At any point in time several users may be operating on the database at once, and when several users write to the same physical location at the same time, their writes must not overwrite one another. Enforcing this is called serialization. Serialization reduces the concurrency of the system, but it is necessary to protect data structures from being corrupted. In the Oracle database, this problem is addressed through latches and locks.
Latches and locks have similarities and differences. What they have in common is that both are resources used to implement serialization. The difference is that a latch is a low-level, lightweight lock that is acquired and released very quickly and is implemented in a manner similar to a semaphore. A lock can be held for a long time and uses a queue, served first-in, first-out (FIFO). Put simply, latches work at the microscopic level, while locks work at the macroscopic level.
Note: a latch is a serialization mechanism for protecting shared data structures in the SGA. It is used not only for the buffer cache but also for the shared pool, the log buffer, and so on.
Latch Overview
The Oracle database uses latches to manage the allocation and deallocation of SGA memory. A latch is a serialization mechanism used to protect shared data structures in the SGA. The implementation of latches depends on the operating system, especially with regard to whether a process needs to wait for a latch and how long it needs to wait.
A latch is a lock that can be acquired and released very quickly, and is typically used to protect the data structures that describe the blocks in the buffer cache. Each latch has an associated cleanup procedure, which is called when the process holding the latch dies. Each latch also has a level that is used to prevent deadlocks: once a process holds a latch at a certain level, it cannot obtain a latch at an equal or lower level.
A latch does not cause blocking; it only causes waiting. Blocking is a system design problem, whereas waiting is a problem of contention for system resources.
Spin and Sleep
Spin
The tuning contention chapter of the performance tuning SG describes it as follows, quoted from the original: Latch behavior differs on single and multiple CPU servers. On a single CPU server, a process requesting an in-use latch would release the CPU and sleep before trying the latch again. On multiple CPU servers, a process requesting an in-use latch would "spin" on the CPU a specific number of times before releasing the CPU and trying again. The number of spins the process would use is OS specific.
Spinning means that a process keeps the CPU to itself, running in a loop, until the spin ends. Other processes cannot get any time on that CPU during this period. On a single-CPU system there is no concept of spinning.
For example, suppose a process wants to read a block in the data cache. It first has to get the latch that protects this block. If another process wants to modify the same block at that moment, it requests the same latch, finds it busy, and spins; it can get the latch, and then modify the block, only after the current holder releases it. If several processes request the latch at the same time, they compete with one another. There is no queue: as soon as the previous holder releases the latch, the waiting processes all rush for it, with no notion of first come, first served. All of this happens very quickly, because latches are designed to be fast and short-lived.
Sleep
Sleeping means temporarily giving up the CPU, which involves a context switch: the CPU has to save the state of the current process (its stack, semaphores, and other data structures), load the state of the next process, run it, and later switch back to the original process. In a system with a high transaction rate and high concurrency, doing this frequently is very expensive. Oracle therefore chooses to spin first: the process stays on the CPU, executes some empty instructions, and requests the latch again, and it keeps spinning until the _spin_count value is reached. Only then does it give up the CPU, sleep briefly, and repeat the whole procedure. Initially a process sleeps for 0.01 seconds, then wakes up and tries to get the latch again. Once the process goes to sleep, a corresponding wait event is raised and recorded in the view v$session_wait, indicating the type of latch the process is waiting for.
Types of Latch
Willing-to-wait
Most latches belong to this type. This type of latch is implemented with test-and-set.
At any time only one process can access a given block of data in memory. If a process cannot get the latch because another process holds it, it spins on the CPU for a very short time and tries again; if that fails, it keeps spinning until the number of spins reaches the limit set by the hidden parameter _spin_count. At that point the process stops spinning and sleeps briefly, and after waking up it repeats the same procedure until it gets the latch on the block. The sleep time follows an algorithm too: it grows as the process keeps failing to get the latch, measured in centiseconds, in the sequence 1, 1, 2, 2, 4, 4, 8, 8, ... The upper limit of the sleep time is controlled by the hidden parameter _max_exponential_sleep, which defaults to 2 seconds. If the current process already holds another latch, its sleep must not be too long (otherwise the processes waiting for that latch would be held up); in that case the limit is set by the hidden parameter _max_sleep_holding_latch, which defaults to 4 centiseconds. This time-limited sleep is also called a short wait.
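The sleep sequence described above can be sketched in Python. This is only an illustration, not Oracle's code: it assumes the sequence is counted in centiseconds (consistent with the 0.01-second initial sleep mentioned earlier) and capped at 2 seconds, and the names sleep_schedule and max_sleep_cs are invented for the example.

```python
from itertools import islice


def sleep_schedule(max_sleep_cs=200):
    """Yield successive sleep times in centiseconds: 1, 1, 2, 2, 4, 4, ...

    Each value is used twice before it doubles, and no sleep exceeds
    max_sleep_cs (200 cs = 2 s, an assumed stand-in for the cap that
    _max_exponential_sleep imposes).
    """
    sleep_cs = 1
    while True:
        for _ in range(2):
            yield min(sleep_cs, max_sleep_cs)
        if sleep_cs < max_sleep_cs:
            sleep_cs *= 2


# First ten sleeps with the default cap.
first_ten = list(islice(sleep_schedule(), 10))
# Same schedule with a much lower cap, as when another latch is already held.
capped = list(islice(sleep_schedule(max_sleep_cs=4), 8))
print(first_ten)
print(capped)
```

Passing a smaller max_sleep_cs models the tighter limit that applies when the sleeping process already holds another latch.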
The other case is a long wait for the latch (latch wait posting). Here, when the requesting process fails to get the latch and goes to sleep, it adds an entry for its request to the latch wait list. When the holding process releases the latch, it checks the latch wait list and posts a signal to the requesting process to wake it up. The latch wait list is a list of processes maintained in the SGA, and it needs a latch of its own to protect it. By default, the shared pool latch and the library cache latch use this mechanism.
If the hidden parameter _latch_wait_posting is set to 2, all latches use this wait method. It wakes up a waiting process more precisely, but maintaining the latch wait list consumes system resources, and contention for the latch on the latch wait list can itself become a bottleneck.
If a process has been requesting, spinning, and sleeping on a latch for a long time, it notifies the PMON process to check whether the process holding the latch has terminated abnormally or died; if so, PMON cleans up the latch it held.
In summary, the latch acquisition process is: request, spin, sleep, request, spin, sleep, ... until the latch is acquired.
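The request, spin, sleep cycle can be illustrated with a small Python sketch. It is not Oracle's implementation: threading.Lock.acquire(blocking=False) stands in for the test-and-set instruction, and the names Latch, try_get, spin_count, first_sleep, and max_sleep are invented for this example.

```python
import threading
import time


class Latch:
    """Toy willing-to-wait latch built on a non-blocking lock attempt."""

    def __init__(self, spin_count=2000):
        self._flag = threading.Lock()   # stands in for the test-and-set word
        self.spin_count = spin_count    # plays the role of _spin_count
        self.sleeps = 0                 # how many times a requester slept

    def try_get(self):
        # One "test-and-set": succeeds only if nobody holds the latch.
        return self._flag.acquire(blocking=False)

    def get(self, first_sleep=0.01, max_sleep=2.0):
        sleep_for = first_sleep
        while True:
            for _ in range(self.spin_count):    # spin phase
                if self.try_get():
                    return
            self.sleeps += 1                    # spin budget used up: sleep
            time.sleep(sleep_for)
            sleep_for = min(sleep_for * 2, max_sleep)

    def release(self):
        self._flag.release()


latch = Latch(spin_count=100)
latch.get()                      # the main thread holds the latch first
result = []


def worker():
    latch.get()                  # spins, sleeps, retries until it succeeds
    result.append("got latch")
    latch.release()


t = threading.Thread(target=worker)
t.start()
while latch.sleeps == 0:         # hold the latch until the worker has slept once
    time.sleep(0.001)
latch.release()
t.join()
print(result, "sleeps:", latch.sleeps)
```

The worker never queues behind anyone: it simply retries, which matches the point made earlier that latch requests are not served in first-come order.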
No-wait
Latches of this type are relatively few. For a latch of this type there are multiple latches available. When a process requests one of them, the request is made in no-wait mode. If the requested latch is not available, the process does not wait but immediately requests another one. Only when none of the latches is available does it start to wait.
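A minimal Python sketch of the no-wait pattern follows, assuming a group of interchangeable latches modelled as threading.Lock objects. The function name get_any_latch and the fallback of waiting on the first latch are illustrative choices, not Oracle behavior taken from the text.

```python
import threading


def get_any_latch(latches):
    """Return the index of the latch acquired from a group of equivalent latches.

    Each latch is first requested in no-wait mode; only if every one is busy
    does the caller fall back to waiting (here, on the first latch).
    """
    for i, latch in enumerate(latches):
        if latch.acquire(blocking=False):   # no-wait request
            return i
    latches[0].acquire()                    # all busy: now willing to wait
    return 0


group = [threading.Lock() for _ in range(3)]
group[0].acquire()                          # pretend latch 0 is already in use
got = get_any_latch(group)                  # skips latch 0 without waiting
print("acquired latch", got)
group[got].release()
group[0].release()
```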
The latch acquisition process can be summarized in the following figure:
[Figure: latch acquisition process, http://img14.poco.cn/mypoco/myphoto/20130109/19/5616035720130109190148054.png]
Latch and Lock
A latch provides mutually exclusive access to in-memory data structures, whereas a lock is taken on a shared resource object in one of several modes, and the modes are either compatible with or exclude one another. From this point of view, access under a latch, including queries, is mutually exclusive: at any time only one process can pin a given piece of memory. Fortunately this period is very short; otherwise system performance could not be guaranteed.
A latch exists only in memory and can be accessed only by the current instance, while a lock acts on database objects and, in a RAC system, allows lock detection and access across instances.
A latch is held and released almost instantaneously, while a lock is not released until the transaction ends properly; how long it is held depends on the size of the transaction.
Latches are not queued, while locks are queued.
Latches do not deadlock, while locks can.
While latches are in use, exceptions can occur, and a latch may be left unreleased because of one; this causes problems, because other processes can no longer obtain it. PMON follows up on such exceptions. The most important part of its handling is rolling back whatever was not committed, which requires latch recovery support, so before an operation under a latch begins, some information is written to the latch recovery area. PMON runs automatically every 3 seconds, but that is a long time in this context, so after a process has failed to get a latch many times, it posts PMON to check whether the latch is being held normally.
Latch Level
Latch levels range from 0 to 14, 15 levels in total, with 0 the lowest and 14 the highest. When a process that already holds a latch requests another one, it may only request a latch of a higher level. The reason is as follows.
Suppose process A holds a level 5 latch and requests a level 3 latch, while process B holds that level 3 latch and requests the level 5 latch. What happens? Each process spins, sleeps, wakes up, and repeats, and neither ever finishes. Latches must therefore be requested in level order rather than arbitrarily: a process holding a lower-level latch may request a higher-level one, never the reverse.
If process A really does need the latch held by process B, its only option is to give up its level 5 latch first and then request process B's latch.
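The level rule can be expressed as a small check. The Python sketch below is an illustration under stated assumptions, not Oracle code: the Session class, the LevelError exception, and the bookkeeping in held_levels are all hypothetical names invented for the example.

```python
class LevelError(Exception):
    """Raised when a latch request would violate the level ordering."""


class Session:
    """Tracks the levels of the latches one process currently holds."""

    def __init__(self, name):
        self.name = name
        self.held_levels = []

    def request(self, level):
        # Only a latch of a strictly higher level than any held may be requested.
        if self.held_levels and level <= max(self.held_levels):
            raise LevelError(
                f"{self.name} holds level {max(self.held_levels)}, "
                f"cannot request level {level}"
            )
        self.held_levels.append(level)

    def release(self, level):
        self.held_levels.remove(level)


a = Session("A")
a.request(5)
try:
    a.request(3)          # lower level while holding level 5: refused
    refused = False
except LevelError as exc:
    refused = True
    print(exc)

a.release(5)              # give up level 5 first ...
a.request(3)              # ... then level 3 can be requested
a.request(5)              # and level 5 again, in ascending order
print(a.held_levels)
```

Because every process climbs the levels in the same direction, the circular wait between process A and process B in the example above cannot form.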
Latch Resource Contention
If there is contention for latch resources, it usually shows up as high CPU usage.
Conversely, if we find that CPU resources are very tight, with utilization constantly above 90% or even at 100%, the main causes are usually the following.
1. If SQL statements do not use bind variables, they cause frequent reads and writes of shared pool memory. If a large number of SQL statements are parsed repeatedly, the result is heavy latch contention and long waits on the shared pool latches involved in parsing SQL. The latches associated with the shared pool are the library cache latch and the shared pool latch. If the database shows contention on these latches, check whether bind variables are being used correctly.
2. A data block that is accessed very frequently is called a hot block. When many users access the same data block at once, the result is contention on the latches of the data buffer pool. The most common forms are buffer busy waits and cache buffers chains.
Cache Buffers Chains:
When a session needs to access a block in memory, it first searches a linked-list-like structure to find out whether the block is in memory. The session must obtain a latch to access this list; if it fails to get the latch, a cache buffers chains latch wait occurs. The reason for this wait is either that too many sessions are accessing the same data block or that the list is too long. If too much data is read into memory, the hash list that manages the data blocks becomes very long, so sessions take longer to scan the list and hold the cache buffers chains latch longer; other sessions then have less chance of getting the latch, and waits increase.
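A toy Python model of this lookup is sketched below. It is not Oracle's data structure: the BufferCache class, the modulo hash, and the gets counter are assumptions made for illustration. It shows that every lookup takes the latch of one chain and holds it while scanning that chain, so a hot block concentrates latch gets on a single chain.

```python
import threading
from collections import defaultdict


class BufferCache:
    """Toy buffer cache: blocks hang off hash chains, one latch per chain."""

    def __init__(self, n_chains=4):
        self.n_chains = n_chains
        self.chains = [[] for _ in range(n_chains)]
        self.latches = [threading.Lock() for _ in range(n_chains)]
        self.gets = defaultdict(int)        # latch gets per chain

    def _chain_of(self, block_addr):
        return block_addr % self.n_chains   # illustrative hash only

    def add(self, block_addr):
        i = self._chain_of(block_addr)
        with self.latches[i]:
            self.gets[i] += 1
            self.chains[i].append(block_addr)

    def lookup(self, block_addr):
        i = self._chain_of(block_addr)
        with self.latches[i]:               # latch held while walking the chain
            self.gets[i] += 1
            return block_addr in self.chains[i]


cache = BufferCache()
for addr in range(16):
    cache.add(addr)
for _ in range(100):
    cache.lookup(5)                         # repeated access to one "hot" block
print(dict(cache.gets))                     # gets pile up on the hot block's chain
print(len(cache.chains[1]))                 # chain length = scan cost under the latch
```

The longer a chain grows, the longer the membership scan inside lookup runs with the latch held, which is the second cause of waits described above.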
Buffer Busy Waits:
When a session needs to access a data block that another session is reading from disk into memory, or that another session is modifying, the current session has to wait, which produces a buffer busy waits wait.
The direct cause of these kinds of latch contention is that too many sessions access the same data block, which creates a hot block problem. Hot blocks may be caused by database settings or by repeatedly executed SQL that frequently accesses the same data blocks. The causes differ according to the type of data block, and different types of hot block are handled differently. Hot blocks can be divided into the following types: table data blocks, index data blocks, index root data blocks, and file header data blocks.
3. Some latch waits are caused by bugs; keep an eye on the related bugs and patches published on Metalink.
Why does latch contention cause high CPU usage?
This is actually easy to understand. Suppose process A holds a latch and process B needs the same latch but cannot get it. Process B then has to spin. If there are many processes like process B, many processes are spinning on the CPU, and the symptom is very high CPU utilization with very low throughput: a typical case of a system that is busy but gets little done.
This article is from the "Linux-related technology" blog; reprinting is declined.