老白講解處理ora-01591

來源:互聯網
上載者:User

資料庫總是報錯:
Tue May  6 13:44:47 2008                                                                                                           
SMON: about to recover undo segment 119                                                                                            
ORACLE Instance topcs2 (pid = 11) - Error 1591 encountered while recovering tran                                                   
saction (119, 18) on object 2309045.      
Errors in file /u2/oracle/ora92/rdbms/log/topcs2_ora_2899.trc:                                                                     
ORA-01591: lock held by in-doubt distributed transaction 108.28.46269                                                              
SMON: mark undo segment 119 as needs recovery                                                                                      

我問小孫這個介面是不是Tuxedo的,小孫說是Tuxedo 6.5的應用。這種XA應用出問題導致ORA-1591的問題
也算是常見問題了,我馬上尋找119號復原段裡是不是有死事務:
SELECT KTUXEUSN,KTUXESLT,KTUXESQN,
KTUXESTA Status,KTUXECFL Flags
FROM x$ktuxe
WHERE ktuxesta!='INACTIVE'
AND ktuxeusn=119;

查詢結果是沒有任何記錄。我一看原來in-doubt 的分散式交易是108.28.46269,於是馬上查詢108號復原段裡
是不是有死事務:
SELECT KTUXEUSN, KTUXESLT, KTUXESQN, 
               KTUXESTA Status, KTUXECFL Flags
  FROM x$ktuxe
WHERE ktuxesta!='INACTIVE'
     AND ktuxeusn=108;

     KTUXEUSN   KTUXESLT   KTUXESQN STATUS           FLAGS                                                                            
   ---------- ---------- ---------- ---------------- ------------------------                                                         
       108         28      46269 PREPARED         SCO|COL|REV|DEAD 

確實存在一個Prepared 狀態的死事務。於是馬上檢查了一下pending_trans$表:
SELECT local_tran_id, global_tran_fmt, global_oracle_id,
   global_foreign_id, state, status, heuristic_dflt,
          session_vector, reco_vector,
        global_commit#
        FROM PENDING_TRANS$;

LOCAL_TRAN_ID GLOBAL_TRAN_FMT GLOBAL_ORACLE_ID GLOBAL_FOREIGN_ID STATE STATUS HEURISTIC_DFLT SESSION_VECTOR RECO_VECTOR GLOBAL_COMMIT#

70.6.108873 306206 TOPCS.c7dd20c6.70.6.108873  forced rollback P  00000001 00000001 1486865870
96.42.84009 306206 TOPCS.c7dd20c6.96.42.84009  forced rollback P  00000001 00000001 1286932454
85.10.101067 306206 TOPCS.c7dd20c6.85.10.101067  forced rollback P  00000001 00000001 1487167659
9.35.29156 306206 TOPCS.c7dd20c6.9.35.29156  collecting P  00000001 00000001 1672793351
80.8.132177 306206 TOPCS.c7dd20c6.80.8.132177  forced rollback P  00000001 00000001 1679427495
18.32.162778 306206 TOPCS.c7dd20c6.18.32.162778  forced commit P  00000001 00000001
64.21.136442 306206 TOPCS.c7dd20c6.64.21.136442  forced commit P  00000001 00000001
73.11.124822 306206 TOPCS.c7dd20c6.73.11.124822  forced rollback P  00000001 00000001 1731227073
63.29.148558 306206 TOPCS.c7dd20c6.63.29.148558  forced commit P  00000001 00000001 

發現並不存在108.28.46249這個事務。看樣子是系統字典表出現問題了,導致這個分散式交易無法被自動回退。首先看看這個
事務鎖定的對象是什麼:

Select NAME from OBJ$ WHERE OBJ#=2309045;
NAME
---------------------------------------
TPTB_IDX
這個一個索引,基表是T_TPTB.對這張表進行全表掃描,看會發生些什麼:
SELECT /*+ full(a)  */ count(*) FROM T_TPTB a;
ORA-01591:lock held by in-doubt distributed transaction 73.11.124822

這張表被另外一個分散式交易73.11.124822 鎖定了。查看下73號復原段的情況:

SELECT KTUXEUSN, KTUXESLT, KTUXESQN, 
               KTUXESTA Status, KTUXECFL Flags
  FROM x$ktuxe
WHERE ktuxesta!='INACTIVE'
     AND ktuxeusn=73;

 KTUXEUSN   KTUXESLT   KTUXESQN STATUS           FLAGS                                                                            
---------- ---------- ---------- ---------------- ------------------------                                                         
        73         11     124822 PREPARED         SCO|COL|REV|DEAD  

執行ROLLBACK FORCE 73.11.124822時出現了錯誤:

SQL>ROLLBACK FORCE '73.11.124822'
ERROR at line 1:
ORA-02058: no prepared transaction found with ID 73.11.124822

在 pending_trans$ 查看目前這個事務的狀態不是PREPARED,因此需要首先在pending_trans$裡將這個事務的狀態設定為
prepared:
UPDATE pending_trans$SET STATE='prepared',
STATUS='p'
WHERE local_tran_id='73.11.124822';
COMMIT;

執行了這個語句後,再次執行 ROLLBACK FORCE  '73.11.124822';
SQL>ROLLBACK FORCE '73.11.124822';
ROLLBACK FORCE '73.11.124822'                                                                                                      
*                                                                                                                                  
ERROR at line 1:                                                                                                                   
ORA-01591: lock held by in-doubt distributed transaction 108.28.46269

看樣子這個事務被我們最初發現的那個事務鎖住了。這兩個事務一個鎖住了表,一個鎖住了索引,所以只能看看能不能強制提交了:
SQL>UPDATE pending_trans$ SET STATE='prepared',
STATUS='P'
where local_tran_id='73.11.124822';
SQL>commit;
SQL>commit force '73.11.124822';
使用這個命令,這個事務被成功的強制提交了。下面來處理108復原段的那個事務。由於這個事務在pending_trans$中缺少記錄,所以首先
將這個事務清理掉:
SQL>exec dbms_transaction.purge_lost_db_entry('108.28.46269')
然後手工插入相關記錄:
SQL>alter system disable distributed recovery;
SQL> insert into pending_trans$ (
        LOCAL_TRAN_ID,
        GLOBAL_TRAN_FMT,
        GLOBAL_ORACLE_ID,
        STATE,
        STATUS,
        SESSION_VECTOR,
        RECO_VECTOR,
        TYPE#,
        FAIL_TIME,
        RECO_TIME)
    values('108.28.46269',
        306206,     
        'XXXXXXX.12345.1.2.3',
        'prepared','P',         
        hextoraw( '00000001' ),
        hextoraw( '00000000' ),
        0, sysdate, sysdate );

SQL>insert into pending_sessions$
    values( '108.28.46269',
        1, hextoraw('05004F003A1500000104'),
        'C', 0, 30258592, '',
        146
      );

SQL>Commit;

然後再次強制提交:
SQL>commit force '108.28.46249';
commit compelte.
完成提交後,小孫再次檢查了一下那個應用,發現問題解決了。

 

主題:ORA-1591的補充說明

前幾天流沙的ORA-1591問題,由於是QQ對話,可能對於對ORA-1591缺乏經驗的人來說不容易看懂,本帖針對這個問題進行進一步的介紹。這樣參考那個執行個體就可以更加清晰的瞭解ORA-1591問題了。

    ORA-01591: "lock held by in-doubt distributed transaction %s"
        Cause: Trying to access resource that is locked by a dead
               two-phase commit transaction that is in prepared state.
       Action: DBA should query the pending_trans$ and related tables,
               and attempt to repair network connection(s) to
               coordinator and commit point. If timely repair is not
               possible, DBA should contact DBA at commit point if
               known or end user for correct outcome, or use heuristic
               default if given to issue a heuristic commit or abort
               command to finalize the local portion of the
               distributed transaction.
總的來說ORA-1591的產生原因是分散式交易失敗,失敗的原因很多,比如網路問題、XA資源管理員存在BUG等,都可能引起失敗。一旦分散式交易失敗,本地事務中,如果有一個事務掙處於活躍狀態,那麼該事務相關的資料就會被鎖定(無論讀寫都會被鎖定),如果訪問這個事務關聯的資料,就會報ORA-1591。一般情況下,ORA-1591可以自動的解開,SMON會在一定時間周期內檢查DBA_2PC_PENDING,找出需要回退的事務,並進行自動的恢複。這裡就有幾個問題,由於分散式交易逾時判斷以及RECO處理周期的關係,一般來說事務自動回復的時間為1分鐘以上,較長的可以達到5-10分鐘。可能會對生產系統造成比較大的影響。為了加快解鎖,可以使用手工處理。這個時候可以使用ROLLBACK FORCE或者COMMIT FORCE。

有時候由於分散式交易恢複出現故障,會出現資料字典不一致,此時該分散式交易就無法正常解除,需要手工幹預來處理。

分析方法:

1、檢查分散式交易的狀態:

SELECT LOCAL_TRAN_ID, GLOBAL_TRAN_ID, STATE, MIXED, HOST, COMMIT#
FROM DBA_2PC_PENDING
WHERE LOCAL_TRAN_ID = '報錯的本地事務號'

2、檢查分散式交易相關其他節點的情況:

SELECT LOCAL_TRAN_ID, IN_OUT, DATABASE, INTERFACE
FROM DBA_2PC_NEIGHBORS;

3、檢查本地復原段:

      SELECT KTUXEUSN, KTUXESLT, KTUXESQN, /* Transaction ID */
             KTUXESTA Status,
             KTUXECFL Flags
      FROM x$ktuxe
      WHERE ktuxesta!='INACTIVE'
            AND ktuxeusn= 復原段編碼
最常見的是DBA_2PC_PENDING中存在一條PREPARED狀態的記錄,在復原段中也發現一個是PREPARED狀態的活躍事務,但是事務有DEAD標誌。這種情況,可以根據事務號進行COMMIT FORCE或者ROLLBACK FORCE.
另外一種情況是DBA_2PC_PENDING中也有記錄,但是狀態和復原段中的狀態不同,這個時候,手工修改sys.pending_trans$中的狀態值,然後再進行ROLLBACK FORCE/COMMIT FORCE。
如果在DBA_2PC_PENDING中找不到記錄,那麼要檢查sys.pending_trans$、sys.pending_sessions$和sys.pending_sub_sessions$。如果是記錄丟失,就可以手工插入記錄,然後進行ROLLBACK/COMMIT FORCE
(參考流沙那個問題的文章)
    insert into pending_trans$ (
        LOCAL_TRAN_ID,
        GLOBAL_TRAN_FMT,
        GLOBAL_ORACLE_ID,
        STATE,
        STATUS,
        SESSION_VECTOR,
        RECO_VECTOR,
        TYPE#,
        FAIL_TIME,
        RECO_TIME)
    values('transid',
        306206,       固定值/
        'XXXXXXX.12345.1.2.3', 隨意取值
        'prepared','P',          表示PREPARED狀態
        hextoraw( '00000001' ),  /*  constant.                              */
        hextoraw( '00000000' ),  /*                                         */
        0, sysdate, sysdate );

    insert into pending_sessions$
    values( '事務號',
        1, hextoraw('05004F003A1500000104'),
        'C', 0, 30258592, '',
        146
      );

    commit;
4、做完ROLLBACK FORCE/COMMIT FORCE後,如果DBA_2PC_PENDING中的記錄未清除或者DBA_2PC_PENDING中有記錄,但是復原段中已經無記錄。那麼使用:
alter session set "_smu_debug_mode" = 4;  //如果9i,使用AUM需要設定,否則後面會出錯commit; exec dbms_transaction.purge_lost_db_entry( '事務號' ) 實際情況要複雜的多,可以參考401302.1

聯繫我們

該頁面正文內容均來源於網絡整理,並不代表阿里雲官方的觀點,該頁面所提到的產品和服務也與阿里云無關,如果該頁面內容對您造成了困擾,歡迎寫郵件給我們,收到郵件我們將在5個工作日內處理。

如果您發現本社區中有涉嫌抄襲的內容,歡迎發送郵件至: info-contact@alibabacloud.com 進行舉報並提供相關證據,工作人員會在 5 個工作天內聯絡您,一經查實,本站將立刻刪除涉嫌侵權內容。

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.