ORA-01591故障處理

來源:互聯網
上載者:User

早晨到辦公室聽同事說表被鎖了,一試,發現表中某欄位為1111111的行都被鎖了,SELECT都不行。報錯誤ORA-01591,開啟TOAD的Knowledge eXpert,描述很少,只是說由於分散式交易錯誤而造成鎖定。詢問同事,昨天通過一個預存程序調用另一個預存程序出了錯誤,而後者通過透明網關insert一些資料到SQl Server資料庫。
立即想到開啟OEM,誰知道大失所望,進入鎖,根本沒發現相關的對象被鎖定,開始有點鬱悶。轉而檢查會話,該使用者有5個會話,都是INACTIVE,不管三七二十一,全部殺掉。結果依舊,並且鎖也沒有出現。遠程登陸上主機,發現CPU和進程都正常,也沒有發現透明網關進程掛死(之前曾發現TG4SQL在無業務量時也會出現25%左右的CPU,掛死)。
突然想到看看alert.log,經過仔細搜尋,終於發現:

Wed Nov 17 00:00:04 2004
Errors in file d:\oracle\admin\xdcj\udump\xdcj_j006_3020.trc:
ORA-12012: 自動執行作業 82 出錯
ORA-01591: 鎖定已被有問題的分配交易處理6.5.887985掛起
ORA-06512: 在line 6

這正是出錯的地方,往前追溯:

Tue Nov 16 17:35:04 2004
Error 28500 trapped in 2PC on transaction 6.5.887985. Cleaning up.
Error stack returned to user:
ORA-02054: 交易處理6.5.887985有問題
ORA-28500: 串連 ORACLE 到非 Oracle 系統時返回此資訊:
[Transparent gateway for MSSQL]
ORA-02063: 緊接著2 lines(源於ZSMOS_CRM)
Tue Nov 16 17:35:04 2004
DISTRIB TRAN QDCJ.US.ORACLE.COM.5ae32328.6.5.887985
  is local tran 6.5.887985 (hex=06.05.d8cb1)
  insert pending prepared tran, scn=6606197672830 (hex=602.2010cb7e)
Tue Nov 16 17:35:07 2004
Errors in file d:\oracle\admin\xdcj\bdump\xdcj_reco_3024.trc:
ORA-28500: connection from ORACLE to a non-Oracle system returned this message:
[Transparent gateway for MSSQL][Microsoft][ODBC SQL Server Driver][SQL Server]使用者 'RECOVER' 登入失敗。 (SQL State: 28000; SQL Code: 18456)
ORA-02063: preceding 2 lines from ZSMOS_CRM

Tue Nov 16 17:35:12 2004
Errors in file d:\oracle\admin\xdcj\bdump\xdcj_reco_3024.trc:
ORA-28500: connection from ORACLE to a non-Oracle system returned this message:
[Transparent gateway for MSSQL][Microsoft][ODBC SQL Server Driver][SQL Server]使用者 'RECOVER' 登入失敗。 (SQL State: 28000; SQL Code: 18456)
ORA-02063: preceding 2 lines from ZSMOS_CRM

這就是事發地點了。看來是昨天下午遠程事務失敗,但是又沒有返回造成分散式交易掛死,從而鎖定了行。終於找到了詳細的錯誤ORA-02054,進入TOAD一查,說是要等待或者提交該事務,可是怎麼操作呢。還是開啟官方文檔搜尋相關內容,在Adminstrator Guide中發現如下內容:
Discovering Problems with a Two-Phase Commit
The user application that commits a distributed transaction is informed of a problem by one of the following error messages:

ORA-02050: transaction ID rolled back,
           some remote dbs may be in-doubt
ORA-02051: transaction ID committed,
           some remote dbs may be in-doubt
ORA-02054: transaction ID in-doubt


A robust application should save information about a transaction if it receives any of the above errors. This information can be used later if manual distributed transaction recovery is desired.

No action is required by the administrator of any node that has one or more in-doubt distributed transactions due to a network or system failure. The automatic recovery features of Oracle transparently complete any in-doubt transaction so that the same outcome occurs on all nodes of a session tree (that is, all commit or all roll back) after the network or system failure is resolved.

In extended outages, however, you can force the commit or rollback of a transaction to release any locked data. Applications must account for such possibilities.

Determining Whether to Perform a Manual Override
Override a specific in-doubt transaction manually only when one of the following situations exists:

The in-doubt transaction locks data that is required by other transactions. This situation occurs when the ORA-01591 error message interferes with user transactions.
An in-doubt transaction prevents the extents of a rollback segment from being used by other transactions. The first portion of an in-doubt distributed transaction's local transaction ID corresponds to the ID of the rollback segment, as listed by the data dictionary views DBA_2PC_PENDING and DBA_ROLLBACK_SEGS.
The failure preventing the two-phase commit phases to complete cannot be corrected in an acceptable time period. Examples of such cases include a telecommunication network that has been damaged or a damaged database that requires a long recovery time.
Normally, you should make a decision to locally force an in-doubt distributed transaction in consultation with administrators at other locations. A wrong decision can lead to database inconsistencies that can be difficult to trace and that you must manually correct.

If the conditions above do not apply, always allow the automatic recovery features of Oracle to complete the transaction. If any of the above criteria are met, however, consider a local override of the in-doubt transaction.

看來是建議差不多,後面Oracle總是試圖登入SQl Server就是要自動回復,可是總不成功。察看視圖DBA_2PC_PENDING確實發現了該事務的痕迹。要怎樣操作呢?

Manually Committing an In-Doubt Transaction
Before attempting to commit the transaction, ensure that you have the proper privileges. Note the following requirements:

If the transaction was committed by... Then you must have this privilege...
You
 FORCE TRANSACTION
 
Another user
 FORCE ANY TRANSACTION
 

Committing Using Only the Transaction ID
The following SQL statement commits an in-doubt transaction:

COMMIT FORCE 'transaction_id';


The variable transaction_id is the identifier of the transaction as specified in either the LOCAL_TRAN_ID or GLOBAL_TRAN_ID columns of the DBA_2PC_PENDING data dictionary view.

For example, assume that you query DBA_2PC_PENDING and determine that LOCAL_TRAN_ID for a distributed transaction is 1:45.13.

You then issue the following SQL statement to force the commit of this in-doubt transaction:

COMMIT FORCE '1.45.13';

Committing Using an SCN
Optionally, you can specify the SCN for the transaction when forcing a transaction to commit. This feature allows you to commit an in-doubt transaction with the SCN assigned when it was committed at other nodes.

Consequently, you maintain the synchronized commit time of the distributed transaction even if there is a failure. Specify an SCN only when you can determine the SCN of the same transaction already committed at another node.

For example, assume you want to manually commit a transaction with the following global transaction ID:

SALES.ACME.COM.55d1c563.1.93.29

First, query the DBA_2PC_PENDING view of a remote database also involved with the transaction in question. Note the SCN used for the commit of the transaction at that node. Specify the SCN when committing the transaction at the local node. For example, if the SCN is 829381993, issue:

COMMIT FORCE 'SALES.ACME.COM.55d1c563.1.93.29', 829381993;

See Also:
Oracle9i SQL Reference for more information about using the COMMIT statement
 

Manually Rolling Back an In-Doubt Transaction
Before attempting to roll back the in-doubt distributed transaction, ensure that you have the proper privileges. Note the following requirements:

If the transaction was committed by... Then you must have this privilege...
You
 FORCE TRANSACTION
 
Another user
 FORCE ANY TRANSACTION
 

The following SQL statement rolls back an in-doubt transaction:

ROLLBACK FORCE 'transaction_id';


The variable transaction_id is the identifier of the transaction as specified in either the LOCAL_TRAN_ID or GLOBAL_TRAN_ID columns of the DBA_2PC_PENDING data dictionary view.

For example, to roll back the in-doubt transaction with the local transaction ID of 2.9.4, use the following statement:

ROLLBACK FORCE '2.9.4';

於是登陸資料庫
COMMIT FORCE '6.5.887985';
然後查看DBA_2PC_PENDING發現狀態已經改為'COMMIT FORCE',SELECT該表相關行,一切正常。至此,故障解決。
總體來看,直接INSERT ... TABLENAME@SQLDBLK還是很危險的,遇上不能正常返回就出問題了。Oracle的文檔是推薦使用包或者預存程序來解決,此後建議同事改用此方法,目前已經測試通過。


相關文章

Cloud Intelligence Leading the Digital Future

Alibaba Cloud ACtivate Online Conference, Nov. 20th & 21st, 2019 (UTC+08)

Register Now >

Starter Package

SSD Cloud server and data transfer for only $2.50 a month

Get Started >

Alibaba Cloud Free Trial

Learn and experience the power of Alibaba Cloud with a free trial worth $300-1200 USD

Learn more >

聯繫我們

該頁面正文內容均來源於網絡整理,並不代表阿里雲官方的觀點,該頁面所提到的產品和服務也與阿里云無關,如果該頁面內容對您造成了困擾,歡迎寫郵件給我們,收到郵件我們將在5個工作日內處理。

如果您發現本社區中有涉嫌抄襲的內容,歡迎發送郵件至: info-contact@alibabacloud.com 進行舉報並提供相關證據,工作人員會在 5 個工作天內聯絡您,一經查實,本站將立刻刪除涉嫌侵權內容。