I. troubleshooting Process
Database Version: oracle10.2.0.5.4
Operating System: redhat5.4
At noon, my colleagues added fields to the table, and then caused more than 70 objects such as storage to be invalid. The system could not run normally. I checked the table where my colleagues added fields. No lock was released.
An invalid object can be compiled, but cannot be compiled. The object remains in the compiling state. After other sessions are killed, the database cannot be compiled because it affects the business and decides to restart the database.
-- Disable DB:
[Oracle @ qs-xezf-db1 ~] $ Sqlplus/As sysdba
SQL * Plus: Release 10.2.0.5.0-productionon Thu Mar 29 13:23:14 2012
Copyright (c) 1982,201 0, Oracle. All rights reserved.
Connected:
Oracle Database 10g enterprise editionrelease 10.2.0.5.0-64bit Production
With the partitioning, OLAP, data miningand real application testing options
Sys @ xezf (qs-xezf-db1)> alter systemswitch logfile;
System altered.
Sys @ xezf (qs-xezf-db1)> alter systemswitch logfile;
System altered.
Sys @ xezf (qs-xezf-db1)> alter systemswitch logfile;
System altered.
-- In my habits, switch logs before shutting down the database.
Sys @ xezf (qs-xezf-db1)> shutdownimmediate
-- Always hangs on this
-- Trace log:
[Oracle @ qs-xezf-db1 bdump] $ tail-falert_xezf.log
Stopping background process qmnc
Thu Mar 29 13:24:08 CST 2012
Stopping background process cjq0
Thu Mar 29 13:24:10 CST 2012
Stopping background process mmnl
Thu Mar 29 13:24:11 CST 2012
Stopping background process mmon
License high water mark = 201
Thu Mar 29 13:24:12 CST 2012
Job Queue slave processes stopped
Thu Mar 29 13:29:11 CST 2012
Active call for process 11403 user 'oracle 'program 'oracle @ qs-xezf-db1'
System state dumped to trace file/u01/APP/Oracle/admin/xezf/udump/xezf_ora_13569.trc
Thu Mar 29 13:29:29 CST 2012
Shutdown: Waiting for active CILS ToComplete.
Thu Mar 29 13:44:15 CST 2012
Mmnl absent for 1207 secs; foregroundstaking over
-- Wait for 20 minutes, and then wait for 10 minutes.
Decide to forcibly close the database:
[Oracle @ qs-xezf-db1 ~] $ Sqlplus/As sysdba
SQL * Plus: Release 10.2.0.5.0-productionon Thu Mar 29 13:53:13 2012
Copyright (c) 1982,201 0, Oracle. All rights reserved.
Connected.
SQL> shutdown abort
Oracle instance shut down.
-- Start DB:
SQL> startup
Oracle instance started.
Total system global area 5251268608 bytes
Fixed size 2103600 bytes
Variable Size 1275070160 bytes
Database buffers 3959422976 bytes
Redo buffers 14671872 bytes
Database mounted.
Database opened.
SQL>
-- When an invalid object is compiled, the system returns to normal.
Ii. analysis process 2.1 analyze system state dump
When the DB is disabled, a system dump file is automatically generated and analyzed using ass. awk:
Oracle uses the ass. awk tool to view system state dump instructions
Http://blog.csdn.net/tianlesoftware/article/details/7237729
[Oracle @ qs-xezf-db1 SRC] $ awk-F ass109.awk/u01/APP/Oracle/admin/xezf/udump/xezf_ora_13569.trc
Starting systemstate 1
........................................ .....
Ass. awk version 1.0.9-processing/u01/APP/Oracle/admin/xezf/udump/xezf_ora_13569.trc
System state 1
~~~~~~~~~~~~~~~~
1:
2: Waiting For 'pmon timer'
3: Waiting For 'rdbms IPC message'
4: Waiting For 'rdbms IPC message'
5: Waiting For 'rdbms IPC message'
6: Waiting For 'rdbms IPC message'
7: Waiting For 'rdbms IPC message'
8: Last wait for 'rdbms IPC message'
9: Last wait for 'smon timer'
10: Waiting For 'rdbms IPC message'
14:
15:
17: Waiting For 'rdbms IPC message'
18: Waiting For 'rdbms IPC message'
23: Waiting For 'SQL * Net message fromclient'
CMD: PL/SQL Execute
26: Last wait for 'ksdxexeotherwait'
38: Waiting For 'SQL * Net message fromclient'
CMD: PL/SQL Execute
39: Waiting For 'SQL * Net message fromclient'
40: Waiting For 'SQL * Net message fromclient'
41: Waiting For 'SQL * Net message fromclient'
75: Waiting For 'SQL * Net message fromclient'
88: Waiting For 'SQL * Net message fromclient'
CMD: PL/SQL Execute
95: Waiting For 'SQL * Net message fromclient'
96: Waiting For 'SQL * Net message fromclient'
97: Waiting For 'SQL * Net message fromclient'
99: Waiting For 'SQL * Net message fromclient'
104: Waiting For 'SQL * Net message fromclient'
CMD: PL/SQL Execute
106: Waiting For 'SQL * Net message fromclient'
107: Waiting For 'SQL * Net message fromclient'
108: Waiting For 'SQL * Net message fromclient'
117: Waiting For 'SQL * Net message fromclient'
CMD: Update
129: Waiting For 'SQL * Net message fromclient'
CMD: Update
130: Waiting For 'SQL * Net message fromclient'
133: Waiting For 'SQL * Net message fromclient'
138: Waiting For 'SQL * Net message fromclient'
140: Last wait for 'db file sequential read' (1, 974b, 1)
CMD: insert
154: Waiting For 'SQL * Net message fromclient'
CMD: PL/SQL Execute
159: Waiting For 'SQL * Net message fromclient'
160: Waiting For 'SQL * Net message fromclient'
162: Waiting For 'SQL * Net message fromclient'
171: Waiting For 'SQL * Net message fromclient'
196: Waiting For 'SQL * Net message fromclient'
204: Waiting For 'SQL * Net message fromclient'
CMD: PL/SQL Execute
216: Waiting For 'SQL * Net message fromclient'
CMD: PL/SQL Execute
220: Waiting For 'SQL * Net message from client' [latchfor]
Blockers
~~~~~~~~
Above is a list of all the processes. If they are waiting for a resource
Then it will be given in square brackets. Below is a summary of
Waited upon resources, together with the holder of that resource.
Notes:
~~~~~
O A process ID '??? 'Implies that the holder was not found in
Systemstate.
Resource holder state
Latch ??? Blocker
Object names
~~~~~~~~~~~~
Latch for shutdown: Waiting foractive callto complete.
4767 lines processed.
[Oracle @ qs-xezf-db1 SRC] $
2.2 analysis: shutdown: Waiting for active calltocomplete.
During the shutdown process, this step was stopped for nearly 15 minutes.
Thu Mar 29 13:29:11 CST 2012
Active call for process 11403 user 'oracle 'program 'oracle @ qs-xezf-db1'
System state dumped to trace file/u01/APP/Oracle/admin/xezf/udump/xezf_ora_13569.trc
Thu Mar 29 13:29:29 CST 2012
Shutdown: Waiting for active CILS ToComplete.
Thu Mar 29 13:44:15 CST 2012
Mmnl absent for 1207 secs; foregroundstaking over
MOS is explained as follows:
Alert Log: shutdown waiting for activecallto complete [ID 1039389.6]
2.2.1 Problem Description
You areattempting to shut down the database and the database hangs.
Alert Log contains the following message:
-- When we try to close the database, the database hang is occupied, and then the Alert Log will display the following information:
Shutdown: Waiting for active cballs to complete
There are noother error messages in the Alert Log.
There are no other error messages.
2.2.2 solution Description: -- Solution description
--------------------------------------------------------
Caveat: this note is intended for UNIX platforms primarily, but we shouldaddress this issue on Windows platforms as well. On Windows, stopping the service is a validworkaround for these errors.
--------------------------------------------------------
Warning this method is applicable to UNIX platforms, but Windows platforms also apply.
Locate and kill any client connections tothe database at the Unix level, as follows:
The solution is to find and kill any client connection at the UNIX system level. The specific method is as follows:
1. Locate any client connections to thedatabase using ps, and grep for any processes belonging to this.
Example: PS-Ef | grep Sid
2. Look for processes that include a 'local = no' designation.
Example: osupport 6235 1 0 Nov 24 oraclesid (local = No)
For more information, see:
Local = No and local = yes in the Oracle server process
Http://blog.csdn.net/tianlesoftware/article/details/5675404
3. Kill the Unix process (es) with the 'Local = no' designation.
-- Kill the process with local = No
Example: Kill-9 6235
2.2.3 explanation-causes of this phenomenon
The database iswaiting for pmon to clean up processes, but pmon is unable to clean them. theclient connections to the server are causing the shutdown immediate or normalto hang. killing them allows pmon toclean up and release the associated Oracle processes and resources.
-- The database waits for the pmon process to clean up the local = No processes, but the pmon process cannot clean them. These processes will cause the database to shut down Hang and kill these processes, then let pmon clean up and release other database processes and resources, and complete the database shutdown operation.
For more information about the pmon process, refer:
Oracle background process description
Http://blog.csdn.net/tianlesoftware/article/details/5587788
What resources are we talking about?
-- The resources that make up the dB to close include:
1) Any non committed transactions must berolled back
2) any temporary space (sort segments/lobs/session temporary tables) must be freed
3) The session itself and any associatedmemory consumed by the session.
4) Internal Locks/enqueues must becleaned up
Often Oracle (SMON or pmon depending on whether shared server is used) will wait for the osto terminate the process (es) associated with the session. I the OS never returns, or fails to terminatethem, then the instance shutdown will hang with this message (shutdown waitingfor active callto complete)
Other meansexist to achieve a quick shutdown, as outlined in note=408.1-what is the fastest way to cleanly shutdown an Oracle database?
Note that ine-business environments, the same messages can be produced if the databaseshudown process starts before the concurrent manager stops in the applicationserver.
2.3 analysis: mmnl absent for 1207 secs; foregroundstaking over
In this fault I encountered, I waited for 10 minutes and got stuck here. Then I shutdown abort dB.
Mos description:
"Mmnl absent for % u secs; foregroundstaking over" messages in alert. log [ID 465891.1]
When there is ahang situation for a long period of time, Oracle writes this message tothe alert. Log: "mmnl absent for % u secs; foregrounds takingover ".
-- When the database hang remains for a while, Oracle will write this information to the Alert Log.
Thu Nov 1 15:20:56 2007
Mmnl absent for 3805 secs; foregrounds taking over
Thu Nov 1 15:58:13 2007
Starting Oracle instance (normal)
If this is ahang situation during Shutdown (after mmnl is stopped) or during databaseopen (when mmnl is not yet started), messages are written to the alert. logevery 5 minutes.
-- When the database is closed (mmnl has been stopped) or when the database is open (mmnl has not been started), the message is written to the alert log every five minutes.
Any completehang situations which take a long time trigger these messages.
2.3.1 cause-cause
These messagesare for informational purposes only to indicate a hang situation which lasted along period of time and mmnl is unable to run its job at that moment. when thehang lasts for several hours before the instance is completely stopped/killed, the tail of the alert. log may flood with these messages. although the messagetext contains mmnl process, in this case mmnl is actually the victim of thehang issue rather than the cause.
-- This information is only used to indicate that the database has been hang and mmnl cannot execute its job.
2.3.2 solution -- Solution
In case of ahang taking a very long time, there is no way to prevent these messages to bewritten to alert. log. hang situation must be resolved to stop the messages. starting with 11.1 version, messages are written to a trace file rather thanthe alert. log to prevent messages to be written to alert. log repeatedly manytimes.
-- In some cases, hang may take a long time and there is no way to prevent the database from writing this information into the Alert Log. From Oracle 11gr1, the message will be written to tracefile instead of Alert Log. This prevents repeated writes to the Alert Log.
Iii. Summary
In my fault this time, there are two problems:
3.1 adding a field guide will invalidate the object
Because the system is running and the modified object is also the core table, many problems are caused. In the final analysis, this is a normative problem. In order to minimize the time needed, I restarted the database, but the database hANG was stopped during the restart process.
Here, the optional operations include hanganalyze, system dump, check the lock, wait condition:
Oracleoradebug command instructions
Http://blog.csdn.net/tianlesoftware/article/details/6525628
Oraclehanganalyze function to diagnose dbhanging
Http://blog.csdn.net/tianlesoftware/article/details/6321961
Oracle lock table processing note
Http://blog.csdn.net/tianlesoftware/article/details/6679014
Oracle view the lock held by the object
Http://blog.csdn.net/tianlesoftware/article/details/6822321
Hang during the 3.2 dB Shutdown Process
The solution I used in the operation is shutdown abort. The solution provided on MOS is to kill the process with local = no at the operating system level, these processes cannot be cleaned up by the pmon process, causing hang in the shutdown process.
Bytes -------------------------------------------------------------------------------------------------------
All rights reserved. reprinted articles are allowed, but source addresses must be indicated by links. Otherwise, the documents will be held legally responsible!
Skype: tianlesoftware
Email: tianlesoftware@gmail.com
Blog: http://www.tianlesoftware.com
WEAVER: http://weibo.com/tianlesoftware
Twitter: http://twitter.com/tianlesoftware
Facebook: http://www.facebook.com/tianlesoftware
LinkedIn: http://cn.linkedin.com/in/tianlesoftware
------- Add a group to describe the relationship between Oracle tablespace and data files in the remarks section. Otherwise, reject the application ----
Dba1 group: 62697716 (full); dba2 group: 62697977 (full) dba3 group: 62697850 (full)
Super DBA group: 63306533 (full); dba4 group: 83829929 dba5 group: 142216823
Dba6 group: 158654907 dba7 group: 172855474 DBA group: 104207940