Analysis of database downtime caused by ORA-04031
Background
On February 5, the user's database went down unexpectedly. After restarting the server, the Oracle database went back to normal. The user hoped to troubleshoot the problem and avoid another downtime accident, troubleshooting of such unexpected downtime is a common case of remote processing. Although there are many causes of downtime, the troubleshooting steps are basically the same and there is a fixed routine, next we will introduce how to locate the problem step by step.
Analysis Steps
Step 1. query alert logs and find error messages
This case is post-event analysis. Only one method for this analysis is to view logs. If it is an operating system, you need to view the logs of the operating system, it is the log of the database. We know that the database has a lot of logs, but the most critical log is the alert Log. If it is a RAC environment, it may involve CRS logs. Here it is a standalone environment, we only need to pay attention to alert logs. The path and log name of alert logs do not need to be described too much. As a DBA, this common sense should be available, when you get the alert Log, you first check the abnormal records before and after the database downtime. In this case, the following error message is displayed before the database downtime:
Errors infile d: \ oracle \ product \ 10.2.0 \ admin \ orump \ bdump \ oraxy_cjq0_4340.trc:
ORA-00604: recursive SQL Level 1 error
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool", "select job, nvl2 (last_date,...", "sqlarea", "tmp ")
Wed Jun04 18:59:17 2014
Errors infile d: \ oracle \ product \ 10.2.0 \ admin \ orump \ bdump \ oraxy_cjq0_4340.trc:
ORA-00604: recursive SQL Level 1 error
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool", "select count (*) from sys. job...", "sqlarea", "tmp ")
Wed Jun04 18:59:22 2014
Errors infile d: \ oracle \ product \ 10.2.0 \ admin \ orump \ bdump \ oraxy_cjq0_4340.trc:
ORA-00604: recursive SQL Level 1 error
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool", "select job, nvl2 (last_date,...", "sqlarea", "tmp ")
Wed Jun04 18:59:22 2014
Errors infile d: \ oracle \ product \ 10.2.0 \ admin \ orump \ bdump \ oraxy_cjq0_4340.trc:
ORA-00604: recursive SQL Level 1 error
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool", "select count (*) from sys. job...", "sqlarea", "tmp ")
Wed Jun04 18:59:27 2014
Errors infile d: \ oracle \ product \ 10.2.0 \ admin \ orump \ bdump \ oraxy_cjq0_4340.trc:
ORA-00604: recursive SQL Level 1 error
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool", "select job, nvl2 (last_date,...", "sqlarea", "tmp ")
Through the log information, we can initially locate the cause of this fault is caused by ora-00604 and ora-04031, in order to further accurate positioning, we should view the subtrace file in the prompt, as shown above d: \ oracle \ product \ 10.2.0 \ admin \ orlogs \ bdump \ oraxy_cjq0_4340.trc, which must contain more details.
Installing Oracle 12C in Linux-6-64
Install Oracle 11gR2 (x64) in CentOS 6.4)
Steps for installing Oracle 11gR2 in vmwarevm
Install Oracle 11g XE R2 In Debian
Step 2. view the tracking log to find more detailed information
The tracking log query requires some experience and sensitivity to the problem. The log Content format varies depending on the tracing object. The following is the key information for this tracking:
SO: 000007FF493D38A0, type: 4, owner: 000007FF49005208, flag: INIT/-/0x00
(Session) sid: 543 trans: 0000000000000000, creator: 000007FF49005208, flag: (51) USR/-BSY /-/-/-/-/-
DID: 0001-0016-00000003, short-termDID: 0000-0000-00000000
Txn branch: 0000000000000000
Oct: 0, prv: 0, SQL: 0000000000000000, psql: 0000000000000000, user: 0/SYS
Last wait for 'sga: allocation forcing component growth 'blocking sess = 0x0000000000000000 seq = 30782wait_time = 15629 seconds since wait started = 0
= 0, = 0, = 0
Dumping Session Wait History
For 'sga: allocation forcing component growth 'count = 1 wait_time = 15629
= 0, = 0, = 0
For 'sga: allocation forcing componentgrowth 'count = 1 wait_time = 15006
= 0, = 0, = 0
For 'latch: shared pool 'count = 1 wait_time = 624
Address = c96aed8, number = d6, tries = 1
For 'latch: shared pool 'count = 1wait_time = 1214
Address = c96aed8, number = d6, tries = 0
For 'latch: library cache' count = 1wait_time = 77
Address = 324ef0f0, number = d7, tries = 0
For 'latch: shared pool 'count = 1wait_time = 1369765
Address = c96aed8, number = d6, tries = 0
For 'rdbms ipc message' count = 1wait_time = 5007402
Timeout = 1f4, = 0, = 0
For 'rdbms ipc message' count = 1wait_time = 5006909
Timeout = 1f4, = 0, = 0
For 'rdbms ipc message' count = 1wait_time = 5007270
Timeout = 1f4, = 0, = 0
For 'rdbms ipc message' count = 1wait_time = 5004478
Timeout = 1f4, = 0, = 0
Temporary object counter: 0
----------------------------------------
UOL used: 0 locks (used = 1, free = 4)
KGXAtomic Operation Log 000007FF35B23660
Mutex 0000000000000000 (0, 0) idn 0 success NONE
Cursor Parent uid 543 efd 10 whr 4 slp 0
Authorization = NONE pt1 = 000007FF2DD5ECA8pt2 = 000007FF2DD5ED10 pt3 = 000007FF2DD5F230
Pt4 = 0000000000000000 u41 = 2 stt = 0
KGXAtomic Operation Log 000007FF35B236A8
Mutex 000007FF2A744D18 (0, 12) idn 0 limit NONE
Cursor Stat uid 543 efd 11 whr 2 slp 0
Authorization = NONE pt1 = 000007FF2A744BE8pt2 = 0000000000000000 pt3 = 0000000000000000
Pt4 = 0000000000000000 u41 = 0 stt = 0
KGXAtomic Operation Log 000007FF35B236F0
Mutex 0000000000000000 (0, 0) idn 0 success NONE
Library Cache uid 543 efd 0 whr 0 slp 0
For more details, please continue to read the highlights on the next page: