Common Oracle wait events

Oracle wait events are an important basis and indicator for measuring the operating status of Oracle. The concept of wait events was introduced in Oracle 7.0.12 with approximately 100 wait events. In Oracle 8.0 this number grew to approximately 150, in Oracle 8i to approximately 200, and in Oracle 9i to roughly 400. Wait events fall into two main categories: idle wait events and non-idle wait events.

Idle wait events indicate that Oracle is waiting for some work to do, such as the next request from a client. When diagnosing and tuning a database, we usually do not need to pay much attention to these events.

Common idle events include:

• Dispatcher Timer

• Lock element cleanup

• Null event

• Parallel query dequeue wait

• Parallel query idle wait - Slaves

• Pipe get

• PL/SQL lock Timer

• Pmon timer

• Rdbms ipc message

• Slave Wait

• SMON Timer

• SQL*Net break/reset to client

• SQL*Net message from client

• SQL*Net message to client

• SQL*Net more data to client

• Virtual circuit status

• Client message

Non-idle wait events are tied to specific Oracle activity; they occur while the database or the application is doing work. These are the events we should pay attention to and study when tuning the database.

Some common non-idle wait events include:

• DB file scattered read

• DB file sequential read

• Buffer busy waits

• Free buffer waits

• Enqueue

• Latch free

• Log File parallel write

• Log File sync

1. DB file scattered read

This event usually indicates waits related to full table scans. When the database performs a full table scan, the blocks read are scattered throughout the buffer cache for performance reasons. If this wait event is significant, it may indicate that some of the tables being fully scanned are missing indexes or have inappropriate indexes, and we may need to check whether those tables are indexed correctly.
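As a quick check, sessions currently waiting on this event can be listed from V$SESSION_WAIT. A minimal sketch (for this event, P1 is the file number, P2 the starting block and P3 the number of blocks read):

    -- sessions currently waiting on multiblock (full scan) reads
    SELECT sid, event, p1 AS file#, p2 AS block#, p3 AS blocks
    FROM   v$session_wait
    WHERE  event = 'db file scattered read';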

However, this wait event does not necessarily mean poor performance. Under some conditions Oracle will deliberately use a full table scan instead of an index scan to improve performance; the choice depends on the amount of data accessed. Under the cost-based optimizer (CBO) Oracle makes a more intelligent selection, whereas under the rule-based optimizer (RBO) Oracle tends to prefer indexes.

Because blocks read by a full table scan are placed at the cold end of the LRU (least recently used) list, small tables that are accessed frequently can be cached in memory to avoid repeated physical reads.
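For example, a small, frequently read reference table can be marked CACHE so that blocks read by a full scan are kept at the hot end of the LRU list rather than aged out immediately. A minimal sketch; the table name is hypothetical:

    -- keep full-scan blocks of a small lookup table at the hot end of the LRU list
    ALTER TABLE lookup_codes CACHE;

    -- revert to the default behaviour
    ALTER TABLE lookup_codes NOCACHE;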

When this wait event is significant, you can use the V$SESSION_LONGOPS dynamic performance view to help diagnose it. This view records long-running operations (those lasting more than 6 seconds), many of which turn out to be full table scans; in any case, this information is worth our attention.
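A minimal sketch of such a check, listing long-running operations that are still in progress and how far they have progressed:

    -- long-running operations (often full table scans) still in progress
    SELECT sid, serial#, opname, target, sofar, totalwork,
           ROUND(sofar / totalwork * 100, 2) AS pct_done
    FROM   v$session_longops
    WHERE  totalwork > 0
    AND    sofar < totalwork;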

2. DB file sequential read

This event usually indicates single-block reads (such as index block reads). If this wait event is significant, it may indicate a problem with the join order in a multi-table join, with the driving table not chosen properly; or it may indicate that indexes are being used unselectively.

In most cases we say that records can be fetched more quickly through an index, so a certain amount of this wait is normal in a well-coded, well-tuned database. In many cases, however, using an index is not the best choice: for example, when reading a large proportion of the rows from a large table, a full table scan may be significantly faster than an index scan. During development we should take care that such queries avoid index scans.

3. Free buffer waits

This wait event indicates that the system is waiting for a free buffer in memory, meaning there are currently no free buffers in the buffer cache. If the application is well designed, the SQL is written properly and bind variables are used consistently, this wait may indicate that the buffer cache is too small, and you may need to increase DB_CACHE_SIZE (DB_BLOCK_BUFFERS in earlier releases).

Free buffer waits may also indicate that DBWR cannot write dirty buffers out fast enough, or that there is serious contention on the disks. You can consider adding checkpoints, using more DBWR processes, or adding physical disks to spread the load and balance I/O.
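A minimal sketch of the related checks and adjustments; the 512M figure is only an illustration, and DB_WRITER_PROCESSES is a static parameter that requires an instance restart to change:

    -- current buffer cache size and number of database writer processes
    SELECT name, value
    FROM   v$parameter
    WHERE  name IN ('db_cache_size', 'db_writer_processes');

    -- enlarge the buffer cache (dynamic in Oracle9i, within SGA_MAX_SIZE)
    ALTER SYSTEM SET db_cache_size = 512M;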

4. Buffer busy waits

This wait event indicates that the system is waiting for a buffer that is being used in an unshareable way, or one that is currently being read into the buffer cache. In general, buffer busy waits should not exceed 1% of total waits. Check the buffer wait statistics section of the report (or V$WAITSTAT) to see whether the waits are on segment headers. If so, consider adding free lists (freelists, for dictionary-managed tablespaces in Oracle8i) or freelist groups; in many cases this adjustment takes effect immediately. Before 8.1.6 the FREELISTS parameter could not be modified dynamically; in 8.1.6 and later, to modify FREELISTS dynamically, COMPATIBLE must be set to at least 8.1.6.
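A minimal sketch of the check against V$WAITSTAT and of the free-list adjustment; the table name is hypothetical:

    -- which class of block the buffer busy waits fall on
    SELECT class, count, time
    FROM   v$waitstat
    ORDER  BY count DESC;

    -- add free lists to a segment whose header shows contention (hypothetical table)
    ALTER TABLE orders STORAGE (FREELISTS 4);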

If the waits are on undo headers, you can add rollback segments to relieve the contention. If the waits are on undo blocks, we may need to review the application to reduce large-scale consistent reads, reduce the data density in the tables being read consistently, or increase DB_CACHE_SIZE.

If the waits are on data blocks, consider moving frequently accessed tables or rows onto different blocks, or spreading the data over a wider range (for example, by increasing PCTFREE to spread the rows out and reduce contention) so as to avoid "hot" blocks. You can also consider adding free lists to the table or using locally managed tablespaces.

If the waits are on index blocks, consider rebuilding the index, partitioning the index, or using a reverse-key index. To prevent buffer busy waits on data blocks you can also use a smaller block size: a single block then holds fewer rows and is therefore not so "busy". Alternatively, set a larger PCTFREE to spread the rows out physically and reduce hot contention between records.

When executing DML (INSERT/UPDATE/DELETE), Oracle writes transaction information into the data block's interested transaction list (ITL). For tables accessed concurrently by multiple transactions, contention and waits on ITL slots may occur; to reduce this wait you can increase INITRANS so that more ITL slots are available. Oracle9i introduces a new concept, automatic segment space management (ASSM), in which Oracle uses bitmaps to manage space within segments.
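A minimal sketch of both approaches; the table name, tablespace name, file path and sizes are hypothetical:

    -- reserve more ITL slots in each block of a hot table
    ALTER TABLE orders INITRANS 4;

    -- create a tablespace that uses automatic segment space management (Oracle9i)
    CREATE TABLESPACE users_assm
      DATAFILE '/u01/oradata/orcl/users_assm01.dbf' SIZE 100M
      EXTENT MANAGEMENT LOCAL
      SEGMENT SPACE MANAGEMENT AUTO;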

The combination of ASSM and locally managed tablespaces (LMT) fundamentally changes Oracle's storage mechanism. Bitmap-managed free space can reduce buffer busy waits, which were a serious problem in releases before Oracle9i.

Oracle claims that ASSM significantly improves the performance of concurrent DML, because different parts of the bitmap can be used at the same time, eliminating the serialization involved in searching for free space. According to Oracle's own tests, bitmap-managed free space eliminates contention on the segment header and allows very fast concurrent inserts. In Oracle9i, buffer busy waits are therefore much less common.

5. Latch free

A latch is a low-level serialization mechanism used to protect shared memory structures in the SGA. A latch is like a memory lock that is acquired and released very quickly; it prevents a shared memory structure from being modified by multiple processes at the same time. If a latch is not available, a latch free miss is recorded. There are two types of latch requests:

■ Immediate.

■ Willing to wait.

If a process requests a latch in immediate mode and the latch is held by another process, the process does not wait for the latch to become available; it continues with other work instead.

Most latch problems are related to the following operations:

Poor use of bind variables (library cache latch), a high redo generation rate (redo allocation latch), contention for buffers on the LRU list (cache buffers LRU chain), and "hot" blocks in the buffer cache (cache buffers chains).

It is often said that if you want to design a system that will fail, ignoring bind variables is enough by itself. For highly heterogeneous systems, the consequences of not using bind variables are extremely serious.

In addition, some latch waits are related to bugs; you should watch the relevant bug announcements and patch releases on Metalink. This issue should be investigated when the latch miss ratio is greater than 0.5%.
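A minimal sketch of a miss-ratio check against V$LATCH:

    -- latches with the highest miss ratios on willing-to-wait gets
    SELECT name, gets, misses, sleeps,
           ROUND(misses / DECODE(gets, 0, 1, gets) * 100, 3) AS miss_ratio_pct
    FROM   v$latch
    WHERE  misses > 0
    ORDER  BY miss_ratio_pct DESC;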

Oracle's latch mechanism is based on contention, and its handling is similar to CSMA/CD in networking: all user processes compete for the latch. For a willing-to-wait latch, if a process does not obtain the latch on its first attempt, it spins and tries again; if it still has not obtained the latch after _SPIN_COUNT attempts, the process goes to sleep for a specified short period, then wakes up and repeats the previous steps. In 8i/9i the default is _SPIN_COUNT = 2000.

If the SQL cannot be changed, Oracle 8.1.6 and later provide a new initialization parameter, CURSOR_SHARING; setting CURSOR_SHARING = FORCE makes the server rewrite literals as bind variables. Setting this parameter can have side effects, and there are known bugs affecting Java programs; check the relevant bug announcements on Metalink for your specific application.
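A minimal sketch of enabling it, instance-wide or for a single session first:

    -- rewrite literals as bind variables for the whole instance (8.1.6 and later)
    ALTER SYSTEM SET cursor_sharing = FORCE;

    -- or try the effect in one session before changing the instance
    ALTER SESSION SET cursor_sharing = FORCE;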

7. Log buffer space

This wait occurs when sessions generate redo into the log buffer faster than LGWR can write it out, or when a log switch is too slow. It usually indicates that the redo log buffer is too small. To address it, you can increase the size of the log files or the size of the log buffer.
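A minimal sketch of a related check; steadily growing values suggest sessions had to wait for space in the log buffer (LOG_BUFFER itself is a static parameter, changed in the parameter file):

    -- how often sessions had to wait or retry for space in the log buffer
    SELECT name, value
    FROM   v$sysstat
    WHERE  name IN ('redo buffer allocation retries', 'redo log space requests');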

Another possible cause is a disk I/O bottleneck. You can consider placing the log files on faster disks, and where possible use raw devices for the log files to improve write efficiency. As a minimum, do not store log files on the same disks as data files: log files are generally only written, not read, and storing them separately improves performance.

8. Log file switch

When this wait occurs, all commit requests must wait until the log file switch completes.

Log File switch contains two subevents:

Log File switch (archiving needed)

Log File switch (checkpoint incomplete)

Log File switch (archiving needed)

This wait event usually occurs because, after all log groups have been filled, the log file that is to be reused has not yet been archived. It may indicate an I/O problem. Possible solutions (a sketch follows the list):

Consider making the log files larger or adding log groups.

Move the archive destination to faster disks.

Increase LOG_ARCHIVE_MAX_PROCESSES.
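A minimal sketch of two of these adjustments; the group number, file path and sizes are hypothetical:

    -- add another redo log group so the archiver is less likely to fall behind
    ALTER DATABASE ADD LOGFILE GROUP 4 ('/u02/oradata/orcl/redo04.log') SIZE 100M;

    -- allow more archiver processes to run concurrently
    ALTER SYSTEM SET log_archive_max_processes = 4;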

Log file switch (checkpoint incomplete)

When all your log groups have been filled, LGWR tries to reuse the first log file. If the database has not yet written out all the dirty blocks protected by that log file (that is, its checkpoint is not yet complete), this wait event appears.

This wait event usually indicates that DBWR is writing too slowly or there is an I/O problem.

To address it, consider adding DBWR processes, or increasing the number of log groups or the size of the log files.

9. Log file sync

When a user commits or rolls back, LGWR writes the session's redo from the log buffer to the redo log file, and the session must wait until that write completes successfully. To reduce this wait, try committing more rows per transaction rather than committing frequently (frequent commits bring extra overhead), place the redo logs on fast disks, and keep the redo logs on separate physical disks so that archiving has less impact on LGWR.

With software RAID, RAID 5 should generally be avoided, because it can impose a heavy performance penalty on write-intensive systems. You can consider using a file system with direct I/O, or raw devices, to improve write performance.

10. Log file single write

This event relates only to the log file header block. It usually occurs when a new log group member is added or when the log sequence number is advanced.

The header block is written individually because part of the header information is the file number, which differs for each file. Updating the log file headers is done in the background, so this wait is rarely significant.

11. Log File parallel write

This event covers writing redo records from the log buffer to the redo log files, mainly the routine background writes (as opposed to log file sync). If your log groups have multiple members, the writes are issued in parallel when the log buffer is flushed, and this wait event can then occur.

Although the writes are issued in parallel, the operation does not complete until every I/O has finished. If your disks support asynchronous I/O, or I/O slaves are in use, this wait can occur even when there is only one member per redo log group.

This wait time is compared with the log file sync time to measure the write cost of the log files; the ratio is usually referred to as the synchronization cost rate.
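A minimal sketch of such a comparison, using the cumulative figures in V$SYSTEM_EVENT:

    -- compare the two log-related waits since instance startup
    SELECT event, total_waits, time_waited, average_wait
    FROM   v$system_event
    WHERE  event IN ('log file sync', 'log file parallel write');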

12. Control file parallel write

This event occurs when the server process updates all of the control files. If the wait time is short, it can be ignored; if it is long, check whether the physical disks holding the control files have an I/O bottleneck.

Multiple control files are identical mirrored copies kept for safety. In a production system they should be stored on different disks; generally three copies are sufficient, and if there are only two physical disks then two copies are acceptable. Keeping multiple control files on the same disk gains nothing. To reduce this wait, consider the following methods (a sketch of checking the current layout follows the list):

Reduce the number of control files (while still ensuring safety).

Use asynchronous I/O if the system supports it.

Move the control files to physical disks with a low I/O load.
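A minimal sketch of checking how many control files exist and where they are stored:

    -- current control file copies and their locations
    SELECT status, name FROM v$controlfile;

    -- the same list as configured in the parameter file
    SELECT value FROM v$parameter WHERE name = 'control_files';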

13. Control file sequential read / control file single write

These two events occur when there is an I/O problem with a single control file. If the wait time is noticeable, examine that control file and check whether its storage location has an I/O bottleneck.

14. Direct path write

When this wait occurs, the system is waiting for confirmation that all outstanding asynchronous I/O has been written to disk. For this wait we should find the data files with the heaviest I/O (if there is a lot of sorting, these are likely to be the temporary files), spread the load across disks, and speed up the writes.

If the system performs many disk sorts, the temporary tablespace will be used heavily. In that case you can use a locally managed temporary tablespace, split it into several smaller files, and place them on different disks or raw devices.
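A minimal sketch of finding the busiest files; V$TEMPSTAT and V$TEMPFILE cover the temporary files:

    -- data files with the heaviest write activity
    SELECT d.name, f.phywrts, f.phyrds
    FROM   v$filestat f, v$datafile d
    WHERE  f.file# = d.file#
    ORDER  BY f.phywrts DESC;

    -- the same figures for temporary files
    SELECT t.name, s.phywrts, s.phyrds
    FROM   v$tempstat s, v$tempfile t
    WHERE  s.file# = t.file#
    ORDER  BY s.phywrts DESC;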

16. Idle events

Finally, a few words about idle wait events. In general, an idle wait means that the system has nothing to do, or is waiting for a user request or response, and we can usually ignore these events. The idle events are listed in the STATS$IDLE_EVENT table created by Statspack.
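A minimal sketch of listing them:

    -- idle wait events as classified by Statspack
    SELECT event FROM stats$idle_event ORDER BY event;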

We should have a general impression of the system's main idle wait events: if the top 5 wait events are mostly idle events, your system is, broadly speaking, relatively idle.

 
