Killing an Oracle process from inside Oracle
I had a following situation few days Ago–i is running a CREATE TABLE as SELECT over a heterogenous services Dblink. However I cancelled this command via pressing CTRL + C twice in Windows Sqlplus (this actually just kills the client Sqlplus The call).
Anyway, when I wanted to drop the table involved, this happened:
sql> drop table Mytable;drop table MYTABLE *error at line 1:ora-00054:resource busy and acquire with NOWAIT Specif Ied
I can ' t drop a table as someone is holding a lock on it. Fair enough, this is a dev environment used only by me, so I used dba_objects. OBJECT_ID to find out the object ID of this table:
sql> @o mytableowner object_name object_type CREATED last_ddl_time status OID d_ OID------------------------------------------------------------------------------------------------------------ ----------------------------xyz_dev01_owner MYTABLE TABLE 20080616 11:08:44 20080616 11:08:44 VALID 63764 63764
... and then I queried what Enqueue locks were held on the that object:
Sql> SELECT * from V$lock where id1=63764; ADDR kaddr SID TY ID1 ID2 lmode REQUEST CTIME BLOCK---------------------------- ------------------------------------------------------------40034278 40034290- TM 63764 0 6 0 2662 0
Ok, I see session of holding a TM lock on the that table. I queried the corresponding serial# from V$session as well and killed the session:
Sql> alter system kill session ' 130,8764 '; Alter system kill session ' 130,8764 ' *error at line 1:ora-00031:session Marke D for killsql> select * from V$lock where id1=63764; ADDR kaddr SID TY ID1 ID2 lmode REQUEST CTIME BLOCK---------------------------- ------------------------------------------------------------40034278 40034290 63764 0 6 0 2668 0
After hanging-seconds, my kill command gave up (and marked my sessions for kill), but my lock is still not released ... What is now?
This happens when the victim session was so stuck somewhere on an Oracle call that it never have a chance to receive the MES Sage it has been killed. And apparently some resources used can in so call can ' t be released (for whatever reason, it may is by design, it may ju St be a bug).
The below queries against V$session and V$lock show that even though the SESSION have been marked to is in killed status , it ' s still holding a lock:
sql> @usid 130USERNAME SID audsid osuser Machine Program SPID hash_value lastcall STATUS-------------------------------------------------------------------- ---------------------------------------------------------------------------xyz_dev01_owner ' 130,8764 ' 33533 1288249 \xyzhost001 sqlplus.exe 3872 3564023715 4032 Killedsql> SELECT * from V$lock where id1=63764; ADDR kaddr SID TY ID1 ID2 lmode REQUEST CTIME BLOCK----------------------- -----------------------------------------------------------------40034278 40034290 TM 63764 0 6 0 2695 0
Ok, I tried various other options, like kill immediate and disconnect, which should has higher chance to clean up my sess Ion properly:
Sql> alter system kill session ' 130,8764 ' Immediate;alter system kill session ' 130,8764 ' Immediate*error at line 1:ora- 00031:session marked for killsql> alter system disconnect session ' 130,8764 ' Immediate;alter system disconnect session ' 130,8764 ' immediate*error at line 1:ora-00031:session marked for killsql> select * from V$lock where id1=63764; ADDR kaddr SID TY ID1 ID2 lmode REQUEST CTIME BLOCK---------------------------- ------------------------------------------------------------40034278 40034290 63764 0 6 0 2710 0
Still no luck, lock is there.
So I found the OS PID of my server process (or actually OS Thread ID inside Oracle.exe process as I am on Windows) and Used oradebug Short_stack to check where this process was stuck (the output is slightly formatted):
sql> oradebug setospid 3872;oracle pid:18, Windows thread id:3872, image:oracle. EXE (SHAD) sql> oradebug short_stack ; _ksdxfstk+14<-_ksdxcb+1481<-_ssthreadsrgruncallback +428<[email protected]+819<-7c80b680<-00000000<-71a557c4<-71ab4376<-6298540c<- 6298325e<-60a0d931<-609d005f<-609b073d<-609af9endexitagent+202<-_hoxexit+188<-_hogmdis+890 <-_hogdisc+8<-_xupidhs+137<-_upidhs+20<-_kpudtch+305<-_ociserverdetach+12<-_ncodncf+268<- _npidr0+2300<-_npifcc+46<-_qerrmfbu+457<-_ Qerrmfetch +1+1291<-_opiodr+1099<-_ rpidrus+178<-_rpidru+88<-_rpiswu2+426<-_rpidrv+1461<-_psddr0+449<-_psdnal+283<-_pevm_exim+ 153<-_pfrinstr_exim+34<-_pfrrun_no_tool+56<-_pfrrun+781<-_plsql_run+738<-_pr+1099<-_opidrv+ 819<-_sou2o+45<-_opimai_real+112<-_opimai+92<[email protected]+726<-7c80b680
This terse stack shows (start-reading from bottom right-to-left) this process is stuck somewhere "above" Qerrmfetch (Remot e Fetch). I guess the functions starting with "H" above that is heterogenous services functions. By the v$session_wait didn ' t show no WAIT state changes either and the SESSION itself is still constantly waiting f or "HS Message to Agent" event. So, being stuck-a HS call is probably the reason why this session could not clean itself up.
Now, in such situations one normally would proceed with Orakill on Windows or just killing this server process at Unix Lev El (After carefully thinking, what's your ' re about to do). Killing the process makes Pmon to perform the cleanup and Pmon usually does clean all resources up OK. However I didn ' t has access to that Windows server box, so OS kill is out of question.
So, I used another trick. While being connected to the victim process using Oradebug, I-issued this:
Oradebug event immediate crash; ora-00072:process "Windows thread id:3872, image:oracle. EXE (SHAD) "is not activesql>
After waiting for a while (crashdump occurred), Sqlplus reported the target process doesn ' t exist anymore. Thanks to the crash, Pmon had woken up and performed the cleanup.
Let's see if it helped:
sql> oradebug short_stack;ora-00072:process "Windows thread id:3872, image:oracle. EXE (SHAD) "Isn't activesql> select * from V$lock where id1=63764;no rows selectedsql> @usid 130no rows Selectedsql > drop table MYTABLE; Table dropped.
Yippee, I-finally got rid of that session, lock and could drop my table!
Note that I would still prefer killing the processes with Unix kill or Windows Orakill, however there ar E Some bugs with Orakill (search in Metalink) then it doesn ' t always succeed killing the thread properly. And in my case I didn ' t has OS access anyway.
Ideally, the ALTER SYSTEM KILL session command should do all needed cleanup for us, but there is some corner cases involv ing loops, hangs and bugs where this approach won ' t work. There ' s a Metalink note 1020720.102 with a list of the ways for killing Oracle processes/threads from OS level.
nb! If you want to move to the ' New world '-offload your data and workloads to Hadoop, without have to re-write your Existi ng Applications-check out gluent. We are making history! ;-)
Windows Session Management