Dba_oracle event await analysis (concept)

Source: Internet
Author: User
Tags metalink

2014-12-18 Baoxinjian

I. Summary

Oracle's wait events are an important basis for measuring Oracle health and metrics.

The concept of wait events was introduced in Oracle7.0.1.2, with roughly 100 waiting events.

This number is increased to approximately 150 in Oracle 8.0, with approximately 200 events in oracle8i and approximately 360 waiting events in oracle9i.

Second, waiting for the event classification

There are two main categories of wait events, idle (idle) Wait events and non-idle (non-idle) wait events.

1. Idle event means that Oracle is waiting for some kind of work, and we don't have to pay much attention to this part of the event when diagnosing and optimizing the database.

    • Dispatcher Timer
    • Lock element Cleanup
    • Null Event
    • Parallel Query Dequeue Wait
    • Parallel Query Idle Wait-slaves
    • Pipe get
    • PL/SQL Lock timer
    • Pmon Timer-pmon
    • RDBMS IPC Message
    • Slave wait
    • Smon Timer
    • Sql*net Break/reset to Client
    • Sql*net Message from Client
    • Sql*net Message to Client
    • Sql*net more data to client
    • Virtual Circuit status
    • Client message

2. Non-idle wait events are specific to Oracle's activities, which refer to the wait events that occur during database tasks or application runs, which we should focus on when we adjust the database.

    • DB file Scattered read
    • DB file Sequential read
    • Buffer Busy Waits
    • Free buffer waits
    • Enqueue
    • Latch FREE
    • Log file Parallel write
    • Log file Sync

Third, query view

1. View the field structure of the V$event_name view:

sql> desc v$event_name;

is the name empty? type----------------------------------------- -----------------------event# Numberevent_id NumberNAMEVARCHAR2( -) PARAMETER1VARCHAR2( -) PARAMETER2VARCHAR2( -) PARAMETER3VARCHAR2( -) wait_class_id Numberwait_class# NumberWait_classVARCHAR2( -)

2. View the total number of wait events:

SQL>Selectcount(* from v$event_name;   COUNT (*) -- --------      1116

3. View waiting event Classifications

SELECTwait_class#, wait_class_id, Wait_class,COUNT(*) as"Count"     fromV$event_nameGROUP  bywait_class#, wait_class_id, Wait_classORDER  bywait_class#; wait_class# Wait_class_idwait_classCount----------- ------------- -------------------- ----------          01893977003 Other717          14217450380 Application -          23290255840 Configuration -          34166625743 Administrative Wu          43875070507 Concurrency +          53386400367Commit                        2          62723168908 Idle94          72000153315 Network *          8    1740759767Useri/O $          9    4108307767Systemi/O -         Ten2396326234 Scheduler7          One    3871361733Cluster -          A644977587 Queueing9

4. Several views related to:

V$session: Represents the beginning of a database activity, as a source.

V$session_wait: The view is used to record the waiting condition of the active SESSION in real time, which is the current information.

V$session_wait_history: is a simple enhancement to the v$session_wait, recording the last 10 waits of the active SESSION.

V$sqltext: When a database bottleneck occurs, it is usually possible to find the session that is waiting for resources from V$session_wait, through the SID of the session, Federated V$session and v$ The SQLText view captures the SQL statements that the session is executing.

V$active_session_history: Is the core of ash, used to record the history of the event SESSION waiting information, sampled once per second, this part of the content recorded in memory, the expectation is to record one hours of content.

Wrh#_active_session_history: Is the storage place of v$active_session_history in Awr.

V$active_session_history: The information in will be refreshed periodically (once per hour) into the load library and reserved for analysis by default for one weeks.

Dba_hist_active_sess_history: A view is a joint representation of a wrh#_active_session_history view and several other views, typically accessed through this view for historical data.

V$system_event: Because v$session records dynamic information that is related to the session's lifecycle and does not record historical information, Oracle provides a view v$system_event to record summary information for all waiting events since the database was started. With this view, users can quickly get a general overview of how the database is running.

Iv. Waiting Events

1. db file scattered read-db files distributed read

This situation usually shows the wait associated with the full table scan. When the database performs a full table sweep, the data is dispersed (scattered) into the buffer Cache based on performance considerations. If this wait event is significant, it may indicate that for some tables with full table scans, no indexes have been created, or no appropriate indexes have been created, we may need to check that these datasheets have determined if the settings are correct.

However, this wait event does not necessarily mean performance is low, under certain conditions Oracle will actively use full table scan to replace the index scan to improve performance, which is related to the amount of data accessed, Oracle under the CBO more intelligent choice, under the Rbo Oracle more inclined to use the index.

Because full table scanning is placed in the cold junction (cold End) of the LRU (Least recently used, least recently applicable) list, for frequently accessed smaller data tables, you can choose to cache them into memory to avoid repeated reads.

When this wait event is significant, it can be diagnosed in conjunction with the V$session_longops dynamic performance view, which records things that have been running for a long time (longer than 6 seconds), possibly many of which are full table scans (anyway, this is something that deserves our attention).

2. db file sequential read-db files are read sequentially.

This event typically shows read operations (such as index reads) related to a single block of data. If this wait event is significant, it may indicate that there is a problem with the table's connection order in a multi-table connection, that the driver table may not be used correctly, or that it may indicate that it is not indexed selectively.

In most cases we say that indexing makes it much quicker to get records, so it's quite normal for a coded, well-tuned database to wait. However, in many cases, using an index is not the best choice, such as reading large numbers of data in large tables, full table scanning may be significantly faster than index scanning, so in development we should be aware that such queries should avoid using index scanning.

3. Free buffer-Release buffer

This wait event indicates that the system is waiting for free space in memory, which indicates that there is no free memory space in the current buffer. If the application is well-designed, the SQL writing specification, fully binding variables, then this wait may indicate that the BUFFER Cache setting is small, you may need to increase db_buffer_cache.

The free Buffer wait may indicate that the DBWR is not writing fast enough, or there is serious competition in the disk, you may want to consider increasing checkpoints, using more DBWR processes, or increasing the number of physical disks, dispersing the load, and balancing IO.

4. Buffer busy-buffers Busy

The wait event indicates that a buffer is waiting for a unshareable to be used, or that it is currently being read into buffer cache. In general, buffer Busy wait should not be greater than 1%. Check the buffer Wait Statistics section (or v$waitstat) to see if the wait is in the segment header (Segment header). If so, consider increasing the free list (freelist, for oracle8i DMT) or adding freelist groups (in many cases this adjustment is immediate and before 8.1.6, this freelists parameter cannot be modified dynamically; In 8.1.6 and later versions, dynamically modifying the feelists requires setting compatible at least 8.1.6).

If this wait is in the undo header, you can resolve the buffer problem by adding a rollback segment (rollback segment). If the wait is on the undo block, we may need to examine the application to properly reduce the large-scale consistent reads, or reduce the density of data in a consistent read (consistent read) table or increase the db_cache_size.

If you are waiting for data block, consider moving the table or data that is frequently accessed concurrently to another block or a wider distribution (you can increase the Pctfree value, expand the data distribution, reduce competition), to avoid this "hot" data block, Or you might consider increasing the free list in the table or using a locally managed tablespace (locally Managed tablespaces).

If you wait in the index block, you should consider rebuilding the index, splitting the index, or using a reverse key index. To prevent buffer busy waits associated with blocks, you can also use smaller blocks: In this case, there are fewer records in a single block, so the block is not so "busy", or you can set a larger pctfree, which expands the physical distribution of the data and reduces the hot competition between records.

When executing DML (INSERT/UPDATE/DELETE), Oracle writes information to data blocks, and for multi-transaction concurrent access data tables, contention and waiting for ITL may occur, in order to reduce this wait, you can increase the Initrans and use multiple ITL slots. In Oracle9i, a new concept was introduced: ASSM (Segment Space Management Auto). With this new feature, Oracle uses bitmaps to manage space usage.

ASSM combined with LMT completely changed the storage mechanism of Oracle, bitmap freelist can alleviate buffer busy waiting (buffer busy wait), which was a serious problem in previous versions of Oracle9i.

Oracle claims that ASSM significantly improves the performance of DML concurrency operations because different parts of the bitmap (the same) can be used simultaneously, eliminating the ability to serialize the search for the remaining space. Based on Oracle's test results, using bitmap freelist will eliminate contention for all segmented headers (resources), as well as ultra-fast concurrent insertions. In Oracle9i, Buffer Busy Wait is no longer common!

5. Latch Free-latch Release

Latch is a low-level queueing mechanism for protecting shared memory structures in the SGA. Latch is like a memory lock that gets and releases quickly. Used to prevent shared memory structures from being accessed concurrently by multiple users. If latch is not available, the latch release failure is logged (latch free miss). There are two types of latch-Related:

    • Immediately
    • Can wait

If a process attempts to obtain a latch in immediate mode, and the latch is already held by another process, the process will not wait for the latch to be available if the latch cannot be used. It will continue to perform another operation.

Most latch issues are related to the following actions:

There is no good use of binding variables (library cache latch), redo generation issues (redo allocation latch), buffer storage contention (cache buffers LRU chain), and presence in buffer cache "hotspot "Block (cache buffers chain).

Often we say that if you want to design a failed system, regardless of binding variables, this condition is sufficient, for heterogeneous systems, the consequences of not using bound variables are extremely serious.

In addition, there are some latch waiting for the bug, should be concerned about the release of Metalink related bugs and patches. This problem should be studied when the latch miss ratios is greater than 0.5%.

Oracle's latch mechanism is competition, which handles similar csma/cd in the network, all user processes compete for latch, latch for the willing wait type (willing-to-wait), if a process does not get latch in the first attempt, Then it waits and tries again, if the latch is not available after _spin_count, then the process goes to sleep, lasts a specified length of time, wakes up again, and repeats the previous steps in order. The default value in 8i/9i is _spin_count= 2000.

If the SQL statement cannot be adjusted, in the 8.1.6 version, Oracle provides a new initialization parameter: Cursor_sharing can enforce binding on the server side by setting cursor_sharing = Force. Setting this parameter may bring some side effects, for Java programs, there are related bugs, the specific application should be concerned about the Metalink bug Bulletin.

6. Log Buffer space-

This wait occurs when you create a redo log on a log buffer that is faster than LGWR writes or if log switch is too slow. This wait appears, usually indicates that the redo log buffer is too small, in order to solve this problem, you can consider increasing the size of the log file, or increase the size of the log buffer.

Another possible cause is a bottleneck in disk I/O, and you might consider using a disk with a faster write speed. Settings under allowed conditions consider using a bare device to hold log files and improve write efficiency. In a general system, the lowest standard is not to put log files and data files together, because usually log files are read-only, separate storage can be achieved performance gains.

7. log file switch-logfile switch

When this wait occurs, all requests for commit (commit) need to wait for the "log file switchover" to complete.

The Log file Switch consists primarily of two child events:

Log file switch (archiving needed)

Log file switch (checkpoint incomplete)

Log file switch (archiving needed)

This wait event occurs when the first log archive is not completed and the wait occurs, usually because the log group loop is full. This wait may indicate a problem with IO. Workaround:

You can consider increasing the log file and increasing the log group

Move archive files to fast disk

Adjust the log_archive_max_processes.

Log file switch (checkpoint incomplete)-Logging switch (checkpoint not completed)

When your log group is finished, LGWR attempts to write the first log file, and if the database does not finish writing the Dirty block recorded in the first log file (for example, the first checkpoint is not completed), the Wait event appears.

This wait event usually indicates that your DBWR is writing too slowly or that there is an IO problem.

To solve this problem, you may want to consider adding additional DBWR or increasing your log group or log file size.

8. log file sync-logfile synchronization

When a user commits or rolls back data, LGWR writes the session's redo to the redo log by the log buffer. The log file synchronization process must wait for this process to complete successfully. To reduce this wait event, you can try to commit more records at once (frequent commits can lead to more overhead). Place the redo logs on a faster disk, or alternately use the redo logs on different physical disks to reduce the impact of archiving on LGWR.

In the case of soft raid, generally do not use RAID 5,RAID5 for frequent write system will bring a large performance loss, you can consider the use of file system direct input/output, or use bare devices (raw device), so that the performance of writing can be improved.

9. log file single write the event is related only to the header block of the Write log file, which typically occurs when adding new group members and promoting serial numbers.

The head block is written individually, because part of the header information is the file number, each file is different. Update log file header This operation is done in the background, generally rarely waiting, without much attention.

Ten. log file parallel write

Writes redo logs from log buffer to the redo log file, mainly referring to regular write operations (relative to log file sync). If your log group has more than one group member, when flush log buffer, the write operation is parallel, and this wait event may occur.

Although this write operation is processed in parallel, the write operation will not be completed until all I/O operations are completed (if your disk supports asynchronous IO or uses IO SLAVE, this wait may occur even if there is only one redo log file member).

This parameter, compared to the log file sync time, can be used to measure the write cost of log file. This is often referred to as the synchronization cost rate.

Control file Parallel write--parallel write

This event may occur when the server process updates all the control files. If the wait is very short, you can not consider it. If the wait time is longer, check the physical disk I/O that holds the control file for bottlenecks.

Multiple control files are identical copies that are used for mirroring to improve security. For a business system, multiple control files should be stored on different disks, generally three are sufficient, if only two physical hard disks, then two control files are also acceptable. It is not practical to save multiple control files on the same disk. To reduce this wait, consider the following methods:

Reduce the number of control files (on the premise of ensuring safety)

If supported by the system, use asynchronous IO

Transfer control files to IO-burdened physical disk

Control file sequential Read/control file single write

Control file continuous read/control file a single write to a single control file I/O has problems when both events occur. If the wait is obvious, check the individual control file to see if there is an I/O bottleneck in the storage location.

Direct path write-directly to write the wait occurs when the system waits to confirm that all outstanding asynchronous I/O have been written to disk.

For this write wait, we should find the data file with the most frequent I/O operations (if there are too many sort operations, most likely a temporary file), spread the load and speed up its write operations.

If there is too much disk ordering on the system that can cause temporary table space operations to occur frequently, consider using the local management table space, dividing into smaller files, writing to different disks, or to bare devices.

. Idle event-Idle Event

Finally we look at a few idle wait events. Generally speaking, idle waiting means that the system waits for nothing to do, or waits for a user's request or response, and usually we can ignore these wait events. Idle events can be queried through the Stats$idle_event table.

Let's take a look at the main idle wait events of the system, and you should have a rough idea of what these events are, if your top 5 wait events are mostly those events, then generally speaking your system is a little bit less.

Reference: rulev5-http://blog.csdn.net/rulev5/article/details/7075401

Dba_oracle event await analysis (concept)

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.