A customer's database has a serious performance problem, and the AWR report suggests that it is related to undo (rollback) segment contention.
Under normal load, the AWR DB time figures for this database are:
Elapsed: 119.92 (mins)
DB Time: 22.99 (mins)
When the problem occurs, the figures become:
Elapsed: 120.07 (mins)
DB Time: 37,447.52 (mins)
The database server has 32 CPUs, and during the sampling period nearly all of them were running at 100%.
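To put the two DB time figures in proportion, dividing DB time by elapsed time gives the average number of active sessions over the snapshot interval. The sketch below only redoes that arithmetic with the numbers quoted above; the function and variable names are illustrative and do not come from the AWR report.

```python
def average_active_sessions(db_time_min: float, elapsed_min: float) -> float:
    """Average active sessions = DB time / elapsed time (same units)."""
    return db_time_min / elapsed_min


CPU_COUNT = 32  # number of CPUs on the database server, as stated in the text

# Figures copied from the two AWR reports quoted above.
normal_aas = average_active_sessions(22.99, 119.92)
problem_aas = average_active_sessions(37_447.52, 120.07)

print(f"normal period:  {normal_aas:.2f} average active sessions")
print(f"problem period: {problem_aas:.2f} average active sessions")
print(f"problem period: {problem_aas / CPU_COUNT:.1f} active sessions per CPU")
```

On these numbers the normal period averages well under one active session, while the problem period averages roughly 312, close to ten per CPU. Most of those sessions must therefore be waiting rather than running.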
Top 5 Timed Events
Event                       Waits      Time (s)  Avg Wait (ms)  % Total Call Time  Wait Class
enq: US - contention        1,995,867  943,404   473            42.0               Other
row cache lock              568,341    699,241   1,230          31.1               Concurrency
gc buffer busy              389,944    227,279   583            10.1               Cluster
enq: TX - index contention  393,340    171,647   436            7.6
buffer busy waits           186,159    107,135   576            4.8
The top 5 wait events show that most of the wait time is spent on enq: US - contention and row cache lock. Judging from this, the database may have hit Bug 7291739.
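As a sanity check on the table, the average wait per event can be recomputed as total wait time divided by the number of waits, and the shares of the two leading events can be added up. This is only a sketch over the figures quoted above; the tuple layout and variable names are illustrative.

```python
# (event, waits, total wait time in s, reported avg wait in ms, % of total call time)
top_events = [
    ("enq: US - contention",       1_995_867, 943_404,   473, 42.0),
    ("row cache lock",               568_341, 699_241, 1_230, 31.1),
    ("gc buffer busy",               389_944, 227_279,   583, 10.1),
    ("enq: TX - index contention",   393_340, 171_647,   436,  7.6),
    ("buffer busy waits",            186_159, 107_135,   576,  4.8),
]

for name, waits, time_s, reported_ms, pct in top_events:
    # Average wait in milliseconds = total seconds / number of waits * 1000
    computed_ms = time_s / waits * 1000
    print(f"{name:28s} computed {computed_ms:7.1f} ms, reported {reported_ms} ms")

# Combined share of the two events the analysis focuses on.
top_two_share = sum(event[4] for event in top_events[:2])
print(f"enq: US - contention + row cache lock: {top_two_share:.1f}% of total call time")
```

The recomputed averages agree with the reported column to within rounding, and the two leading events together account for about 73% of total call time.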
According to the bug description on Metalink, this bug produces a large number of enq: US - contention waits together with latch: row cache objects waits, and causes fairly severe latch contention on dc_rollback_segments.
The dc_rollback_segments statistics in the AWR report for the normal period are:
Cache                 Get Requests  Pct Miss  Scan Reqs  Pct Miss  Mod Reqs  Final Usage
dc_rollback_segments  185,406       0.00      0                    0         3,615
For the problem period, the dc_rollback_segments statistics are:
Cache                 Get Requests  Pct Miss  Scan Reqs  Pct Miss  Mod Reqs  Final Usage
dc_rollback_segments  4,805,587     0.01      0                    3,073     3,613
Clearly, the number of dc_rollback_segments get requests during the problem period is about 26 times the normal figure (4,805,587 versus 185,406).
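The growth factor can be checked directly from the two Get Requests figures, and, because the two snapshots cover almost the same elapsed time, it can also be expressed as a per-second rate. The sketch below uses only numbers quoted in this article; the variable names are illustrative.

```python
# Get Requests on dc_rollback_segments, from the two AWR reports.
normal_gets = 185_406
problem_gets = 4_805_587

# Elapsed time of each snapshot interval, in minutes.
normal_elapsed_min = 119.92
problem_elapsed_min = 120.07

growth_factor = problem_gets / normal_gets
normal_rate = normal_gets / (normal_elapsed_min * 60)     # gets per second
problem_rate = problem_gets / (problem_elapsed_min * 60)  # gets per second

print(f"growth factor: {growth_factor:.1f}x")
print(f"normal period:  {normal_rate:.1f} gets/s")
print(f"problem period: {problem_rate:.1f} gets/s")
```

Mod Reqs also rises from 0 to 3,073 between the two reports, which the ratio above does not capture.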
In addition, because long-running SQL statements appeared on the system shortly before the problem, undo contention increased significantly:
Undo Segment Stats
End Time      Num Undo Blocks  Number of Transactions  Max Qry Len (s)  Max Tx Concurrency
05-May 18:08  7,608            45,560                  301              1,748
05-May 17:58  5,187            24,909                  0                1,364
05-May 17:48  1,229            7,471                   0                307
05-May 17:38  2,942            16,753                  0                1,002
05-May 17:28  1,119            5,293                   0                382
05-May 17:18  2,446            6,925                   898              502
05-May 17:08  2,137            8,464                   349              273
05-May 16:58  2,874            27,562                  0                6
05-May 16:48  2,625            25,278                  0                7
05-May 16:38  2,496            23,711                  1,006            8
05-May 16:28  2,194            21,037                  404              6
05-May 16:18  1,877            17,981                  0                5
05-May 16:08  1,883            17,215                  0                5
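The jump in maximum transaction concurrency is easier to see when the intervals are split into two groups. The sketch below assumes the problem starts with the interval ending at 17:08, which is simply where the concurrency column changes by orders of magnitude. That split point is an assumption made for illustration and is not stated in the AWR report.

```python
# (interval end time on 05-May, max transaction concurrency), oldest first.
undo_stats = [
    ("16:08", 5), ("16:18", 5), ("16:28", 6), ("16:38", 8),
    ("16:48", 7), ("16:58", 6),
    ("17:08", 273), ("17:18", 502), ("17:28", 382), ("17:38", 1002),
    ("17:48", 307), ("17:58", 1364), ("18:08", 1748),
]

# Assumed start of the problem window (see the note above).
PROBLEM_START = "17:08"

# Zero-padded HH:MM strings compare correctly as plain strings.
baseline = [conc for end, conc in undo_stats if end < PROBLEM_START]
problem = [conc for end, conc in undo_stats if end >= PROBLEM_START]

baseline_peak = max(baseline)
problem_low = min(problem)
problem_peak = max(problem)

print(f"baseline peak concurrency: {baseline_peak}")
print(f"problem window range:      {problem_low} to {problem_peak}")
print(f"peak-to-peak ratio:        {problem_peak / baseline_peak:.1f}x")
```

Even the quietest interval in the assumed problem window has far higher transaction concurrency than the busiest baseline interval.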
The one doubt is that this bug is listed as fixed in 10.2.0.4.4, 10.2.0.5, and 11.2.0.1, while the database is currently patched to 10.2.0.4.7, so it remains uncertain whether this bug is really the cause.
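The source of the doubt can be spelled out as a version comparison. The sketch below assumes that patch levels on the same 10.2.0.4 line are cumulative, so a later fifth digit includes the fixes of an earlier one. The helper name and that assumption are illustrative and are not taken from the bug note.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '10.2.0.4.7' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


current = parse_version("10.2.0.4.7")
fixed_in = ["10.2.0.4.4", "10.2.0.5", "11.2.0.1"]

# Only a fix release on the same four-digit line (10.2.0.4) is comparable
# with the current patch level; 10.2.0.5 and 11.2.0.1 are separate lines.
same_line = [v for v in fixed_in if parse_version(v)[:4] == current[:4]]

# Assumption: patch levels on one line are cumulative.
fix_should_be_present = any(parse_version(v) <= current for v in same_line)

print(f"fix releases on the current line: {same_line}")
print(f"fix expected at current patch level: {fix_should_be_present}")
```

Under that assumption the fix should already be installed, which is why the diagnosis of Bug 7291739 cannot be taken as settled.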
Besides Bug 7291739, Oracle Bug 8268775 is also a fairly likely candidate. This database is indeed a RAC environment, and a large number of application sessions were connecting to the instances when the problem occurred.
If this is the bug in question, it is unlikely that it can be fixed within 10.2; solving the problem would require upgrading to at least 11g.
Author: 51cto Blog Oracle Little Bastard