"Reprint" Some scenes of Tx-row lock contention

Source: Internet
Author: User

Some scenarios of TX row lock contention (original, 2016-07-11, Ishin and Emme)


Estevan (Eson)

Technology expert at Yunhe Enmo (Enmotech)

This article is based on the guest sharing session held on Thursday evening, July 7, in the Yunhe Enmo (Enmotech) lecture hall, on the topic "Some scenarios of TX row lock contention", and is provided for your reference.

Overview

During database operation, enq: TX - row lock contention is a common wait event, especially in a RAC environment. Oracle classifies enq: TX - row lock contention as an Application class wait. In most scenarios it is caused by unreasonable application logic design: sessions compete for the acquisition and release of TX enqueue locks, which limits transaction concurrency and results in inefficient application processing.

Note: this article mainly discusses the enq: TX - row lock contention wait event. The TX enqueue also covers index contention and ITL contention, which have other causes and are not discussed here.

Check Method

1. Use SQL queries against the lock views:

Note: in the subquery, request > 0 identifies the sessions that are requesting the lock; joining back to gv$lock on id1, id2 and type lists both the blockers and the blocked sessions.

Note: the blocking_session column (in the waiting session's v$session row) gives the SID of the blocker. It can be combined with dbms_rowid.rowid_create(1, (select data_object_id from dba_objects where object_id = row_wait_obj#), row_wait_file#, row_wait_block#, row_wait_row#) to get the ROWID on which the lock contention occurs.
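A minimal sketch of the two queries described in these notes, assuming the standard gv$lock and v$session columns (not the exact statements shown in the original slides):

  -- Blockers and waiters on the same TX enqueue: request > 0 marks the waiters;
  -- joining back on id1/id2/type also lists the holders (blockers).
  select inst_id, sid, type, id1, id2, lmode, request, block
    from gv$lock
   where (id1, id2, type) in (select id1, id2, type from gv$lock where request > 0)
   order by id1, id2, request;

  -- Reconstruct the ROWID the waiting session is stuck on from its row_wait_* columns.
  select s.sid, s.blocking_session,
         dbms_rowid.rowid_create(1,
           (select data_object_id from dba_objects where object_id = s.row_wait_obj#),
           s.row_wait_file#, s.row_wait_block#, s.row_wait_row#) as locked_rowid
    from v$session s
   where s.event = 'enq: TX - row lock contention';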

2. Hanganalyze

Run oradebug -g all hanganalyze 3 several times at intervals; this quickly identifies the blocking session.
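For example, from a SYSDBA SQL*Plus session (a sketch; the level 3 dump is written to the diagnostic trace directory):

  SQL> oradebug setmypid
  SQL> oradebug -g all hanganalyze 3
  -- wait a short interval, then repeat so persistent blockers stand out
  SQL> oradebug -g all hanganalyze 3
  SQL> oradebug tracefile_name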

3. ASH and AWR

The enq: TX - row lock contention wait is at the session level, so ASH is better suited for the analysis: it shows the corresponding SQL, the P1/P2 values of the wait event, the blocker, and other information (the blocker may not be visible, because it is ACTIVE only briefly or is normally not an ACTIVE session). The AWR report describes the overall load on the database and can assist the analysis; focus on the average wait time of the wait events, the "Segments by Row Lock Waits" section, and the "Enqueue Activity" section.
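As a simple illustration, ASH can also be queried directly along these lines (column names from v$active_session_history, a recent release assumed; the formal report is generated with $ORACLE_HOME/rdbms/admin/ashrpt.sql):

  select sample_time, session_id, sql_id, p1, p2, p3,
         blocking_session, blocking_inst_id
    from v$active_session_history
   where event = 'enq: TX - row lock contention'
   order by sample_time;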

Some things to note:

1) In a RAC environment, if you want to kill a blocking session, you need to identify the instance accurately, otherwise you may kill some other important process by mistake (see the sketch after this list);

2) When you query the SQL statement or the specific row involved in the lock contention, what you usually get is the SQL statement being executed by the blocked session, or the row it is waiting on. The blocker holds an uncommitted transaction and may be sitting INACTIVE, doing nothing; if it is ACTIVE, it may be executing some other SQL statement. The SQL that really triggered the lock contention may already have been aged out of the library cache, and has to be dug out with a tool such as LogMiner.
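A sketch of killing a blocker in RAC with the instance identified, assuming 11g or later (the sid, serial# and instance values below are placeholders):

  -- Confirm which instance the blocker is on first
  select inst_id, sid, serial# from gv$session where sid = :blocker_sid;

  -- '@2' names the instance; on older versions, connect to that instance and omit it
  alter system kill session '123,45678,@2' immediate;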

Test Cases

MOS note 62354.1 summarizes the enq: TX - row lock contention wait event. For a single transaction, when this wait event occurs, the requested lock mode is typically 4 (share) or 6 (exclusive):

1. Mode 4 (share): usually caused by a constraint violation at the transaction level, for example:

1) A primary key or unique key exists; session 1 inserts a row and has not yet committed, and session 2 inserts the same value; session 2 then enters the enqueue wait;

Session state:

2) A primary/foreign key constraint exists (the parent and child can be two tables, or one table with a self-referencing foreign key); session 1 inserts a parent key without committing, and session 2 then inserts a child row referencing it; session 2 enters the enqueue wait.



Session state:
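A minimal reproduction of scenario 1) above, with hypothetical table names (scenario 2), the parent/foreign-key case, behaves analogously):

  -- Session 1
  create table t_pk (id number primary key);   -- hypothetical demo table
  insert into t_pk values (1);                 -- no commit

  -- Session 2
  insert into t_pk values (1);                 -- hangs on enq: TX - row lock contention

  -- Session 3: the waiter shows lmode = 0 and request = 4 on the TX enqueue
  select sid, type, lmode, request from v$lock where type = 'TX';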

2. Mode 6 (exclusive): usually session 1 performs an UPDATE/DELETE on a row without committing, and session 2 then updates/deletes the same row; or the lock is released slowly for some other reason, causing subsequent sessions to enter the enqueue wait.



Session state:
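A minimal reproduction, again with hypothetical names (t_rows is assumed to have numeric id and val columns):

  -- Session 1
  update t_rows set val = val + 1 where id = 1;   -- no commit

  -- Session 2
  update t_rows set val = val * 2 where id = 1;   -- hangs, requested mode 6

  -- Session 3: row_wait_obj# now points at the real table (see the summary below)
  select sid, row_wait_obj#, row_wait_file#, row_wait_block#, row_wait_row#
    from v$session
   where event = 'enq: TX - row lock contention';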

Summary: from the tests above, when the requested mode is 4 the contention usually occurs at the transaction level and the object ID (row_wait_obj#) is usually -1, while when the requested mode is 6 the contention is at the row level and the object ID (row_wait_obj#) corresponds to object_id in dba_objects.

Specific Scenarios

In most cases, enq: TX - row lock contention waits can only be addressed at the application level, by optimizing the application logic and improving transaction concurrency. There are exceptions, however: in some scenarios the contention is caused by inefficient SQL due to IO, network, or other issues, and can be optimized at the database level.

1. Network problems

Phenomenon:

At a hospital, the admission registration system (a client/server application) reported a registration failure when registering the admission of patient ABC; switching to a few other machines to retry did not help, but registering other patients was no problem.

Analysis:

At the time of the problem it was not the peak period for admission registration. Checking the database showed that when the application performed the registration, the related sessions entered the enqueue wait queue with enq: TX - row lock contention wait events; querying the related views, the lock mode requested by the blocked session shown in v$lock was 4. It turned out that, because of network instability that day, the application had run into problems around 17:00 and had been closed abnormally, but the corresponding server process in the database still existed; the transaction start time of blocking session 3322 in v$transaction matched roughly 17:00:
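A query in this spirit, joining gv$session to gv$transaction to see when each open (potentially blocking) transaction actually started, is sketched below; it is an assumed reconstruction, not the original screen output from the case:

  select s.inst_id, s.sid, s.serial#, s.username, s.status,
         t.start_date, s.prev_sql_id
    from gv$session s, gv$transaction t
   where s.taddr = t.addr
     and s.inst_id = t.inst_id
   order by t.start_date;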

Cause of the problem:

For admission registration, the ID card number of each patient is a unique key. When patient ABC was first registered, a row had already been inserted into the table; then the client program crashed and the user process on the database server was not properly cleaned up, so re-registering patient ABC made the new session enter the enqueue wait;

Processing method:

After the cause was understood, the blocking session was killed, and the patient information that had previously failed to register was then registered successfully from the application.


2. Execution plan problems

Phenomenon:

The production database of a provincial electric power company frequently showed many enq: TX - row lock contention wait events; the application team reported that business operations were very slow, seriously affecting business processing.

Analysis:

Logging in to the database for inspection, v$lock showed several blocking sessions (different id1 and id2 values, i.e. blocking on multiple rows); the lock mode requested by the blocked sessions was 6; the blockers did change over time, but slowly.

Further checks showed that the several blocking sessions were themselves not blocked by any other session; their wait event was read by other session, and the SQL statement they were executing was:

Execution Plan:


The execution plan showed that table B was accessed by an index unique scan while table A was accessed by a full table scan. B is a small table; A is relatively large, several GB in size. The resourceid column of table A is very selective, but it had no index, and a single execution of the SQL took hundreds of seconds.

Cause of the problem:

Because table A is large, Oracle cannot cache it entirely in the buffer cache, so every full table scan has to read some blocks from disk. During peak business hours the SQL statement was executed concurrently, producing multiple simultaneous full table scans with huge IO overhead, which caused the read by other session waits. Because the SQL executed so inefficiently, the TX locks were released slowly, and subsequent sessions entered the enqueue wait;

Processing method:

Optimize the SQL statement: create an index on the resource_id column to improve SQL efficiency; the enq: TX - row lock contention wait events then disappear.
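For illustration only, with the table and column names assumed from the case description:

  -- Build the missing index online to avoid blocking ongoing DML (ONLINE requires Enterprise Edition)
  create index idx_a_resource_id on a (resource_id) online;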

3. Application issues

Phenomenon:

At a manufacturing customer, there were a large number of enq: TX - row lock contention wait events in the database; the customer reported that the number of sessions blocked by locks was close to a thousand.

Observe the statistics of the wait events in the ASH report (focus on the number of active sessions):

Observe the AWR comparison report before and after the system anomaly, and the change in the wait events (focus on the average wait time and the share of DB time):

Analysis:

Logging in to the database to check: the requested lock mode is 6 and the blocker keeps changing. Checking the SQL statement being executed:

The execution plan of the SQL is an index range scan on a composite index, so there is no efficiency problem visible from the plan itself. The execution frequency, execution plan, single-execution response time, logical reads and other execution statistics of the last few hours were examined through wrh$_sqlstat:
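The same per-snapshot statistics can be pulled with a query such as the following against dba_hist_sqlstat, the documented view on top of wrh$_sqlstat (the sql_id is a placeholder):

  select sn.end_interval_time, st.plan_hash_value,
         st.executions_delta as execs,
         round(st.elapsed_time_delta   / nullif(st.executions_delta, 0) / 1e6, 3) as secs_per_exec,
         round(st.buffer_gets_delta    / nullif(st.executions_delta, 0))          as gets_per_exec,
         round(st.rows_processed_delta / nullif(st.executions_delta, 0))          as rows_per_exec,
         round(st.clwait_delta         / nullif(st.executions_delta, 0) / 1e6, 3) as cluster_secs_per_exec
    from dba_hist_sqlstat st, dba_hist_snapshot sn
   where st.snap_id = sn.snap_id
     and st.instance_number = sn.instance_number
     and st.sql_id = '&sql_id'
   order by sn.end_interval_time;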


The data shows that plan_hash_value is unchanged, which means the execution plan has not changed, but the single-execution response time of the SQL has changed by orders of magnitude, from the hundredth-of-a-second level to the second level, and logical reads have risen; less of the time is spent on CPU time, and more on Application class and Cluster class waits. In addition, rows_processed, the number of rows processed, shows a gradually rising trend; it appears that requests have been piling up on the application side, so each update touches more and more rows.

Cause of the problem:

Although the application code was not available, the data pointed to an application logic problem. The customer was advised to check at the application level, and it turned out that one application server was initiating requests almost frantically; the concurrency was too high and the application side could not finish processing them. Because of load balancing, the requests were spread across the two RAC nodes, which made the cluster contention, and its impact on DML, increasingly obvious. When session 1 starts a transactional operation (uncommitted) and session 2 then starts a transaction touching the same row, row lock contention occurs: session 2 cannot proceed until session 1's transaction completes, and undo is used to construct a consistent read image. If the two sessions are on different nodes, the redo must first be flushed from the log buffer to disk before the relevant data block can be shipped to the other node:

So the main cause was the abnormal application requests: data accumulated, and each execution processed more and more rows; on top of that, contention inside the RAC cluster lengthened the time needed to acquire and release the enqueue locks, so sessions queued up.

Processing method:

After the problem was clarified, the customer adjusted the application server and fixed the functional module that was initiating the abnormal requests; the enq: TX - row lock contention wait events in the database gradually decreased until they disappeared.

4. Other issues

1) Mistakenly creating bitmap indexes on tables with frequent DML;

2) Unreasonable use of FOR UPDATE, so that the range of rows locked by FOR UPDATE is too large;

For example, a FOR UPDATE after a multi-table join:


Or a FOR UPDATE without a WHERE restriction:
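A hypothetical illustration of both cases (orders and order_items are assumed names):

  -- Too wide: with a bare FOR UPDATE after a join, the matching rows of every table are locked,
  -- and without a selective WHERE clause that can be most of both tables
  select o.order_id, i.item_id
    from orders o join order_items i on i.order_id = o.order_id
   where o.status = 'NEW'
     for update;

  -- Narrower: restrict locking to one table's rows with FOR UPDATE OF and a selective predicate
  select o.order_id, i.item_id
    from orders o join order_items i on i.order_id = o.order_id
   where o.order_id = :id
     for update of o.status;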

Summary

In summary, to minimize enq: TX - row lock contention wait events in systems with many transactions and high concurrency, the problem should be considered at every level, from application design down to the database.

Application level:

1. Constraints usually exist to guarantee data integrity. In concurrent scenarios, fully consider the logical order of transactions, and avoid interleaving the transactions of multiple sessions in a way that triggers constraint conflicts and transaction-level contention;

2. To improve concurrency, split large transactions as much as possible, so that TX enqueue locks are acquired and released more quickly;

3. If pessimistic locking (FOR UPDATE) must be used, minimize the range of rows locked;

Database level:

1. Create appropriate indexes on tables with frequent DML to improve SQL efficiency and reduce the time TX enqueue locks are held; avoid full table scans, which easily lead to IO overhead, hot block contention, and piled-up sessions;

2. Do not use bitmap indexes on tables with frequent DML;

3. For tables with frequent DML, avoid IOT tables, materialized views, and the like;

4. In a RAC environment, pin batch DML operations to a single node as far as possible, to minimize network overhead, cluster contention, consistent-block shipping, and the impact of log flushes.

Eygle adds two final points:

1. Estevan used the AWR comparison report in his shared content. This report is very practical; if you have not used it, it is well worth studying. It can be generated with $ORACLE_HOME/rdbms/admin/awrddrpt.sql.

2. In a RAC environment, as in the third case Estevan mentioned, the complexity of lock contention is greatly magnified, and the Cache Fusion mechanism is more involved. I am sharing a very good document with everyone; after reading it, you will have a much better understanding of RAC and locks. One of the diagrams Estevan referenced comes from this document.

Understanding Oracle RAC Internals - The Cache Fusion Edition. This document is excellent; Markus is the RAC product manager, and it is well worth reading carefully when you have time: http://pan.baidu.com/s/1i4sw4xr

"Reprint" Some scenes of Tx-row lock contention

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.