Statspack之十四-”log file sync” 等待事件（轉貼）

最後更新：2018-12-05 來源：互聯網

上載者：User

創建阿里雲帳戶，並獲得超過 40 款產品的免費試用版；而企業帳戶則可以享有總值 $1200 的免費試用版。立即註冊！

當一個使用者提交(commits)或者復原(rollback),session的redo資訊需要寫出到redo logfile中.
使用者進程將通知LGWR執行寫出操作,LGWR完成任務以後會通知使用者進程.
這個等待事件就是指使用者進程等待LGWR的寫完成通知.

對於復原操作，該事件記錄從使用者發出rollback命令到復原完成的時間.

如果該等待過多，可能說明LGWR的寫出效率低下，或者系統提交過於頻繁.
針對該問題，可以關注:
log file parallel write等待事件
user commits,user rollback等統計資訊可以用於觀察提交或復原次數

解決方案:
1.提高LGWR效能
盡量使用快速磁碟，不要把redo log file存放在raid 5的磁碟上
2.使用批量提交
3.適當使用NOLOGGING/UNRECOVERABLE等選項

可以通過如下公式計算平均redo寫大小：

avg.redo write size = (Redo block written/redo writes)*512 bytes

如果系統產生redo很多，而每次寫的較少，一般說明LGWR被過於頻繁的啟用了.
可能導致過多的redo相關latch的競爭,而且Oracle可能無法有效使用piggyback的功能.

我們從一個statspack中提取一些資料來研究一下這個問題.

1.主要資訊

DB Name         DB Id    Instance     Inst Num Release     OPS Host------------ ----------- ------------ -------- ----------- --- ------------DB           1222010599  oracle              1 8.1.7.4.5   NO  sun                Snap Id     Snap Time      Sessions                ------- ------------------ -------- Begin Snap:       3473 13-Oct-04 13:43:00      540   End Snap:       3475 13-Oct-04 14:07:28      540    Elapsed:                  24.47 (mins)Cache Sizes~~~~~~~~~~~           db_block_buffers:     102400          log_buffer:   20971520              db_block_size:       8192    shared_pool_size:       600MLoad Profile~~~~~~~~~~~~                            Per Second       Per Transaction                                   ---------------       ---------------                  Redo size:             28,458.11              2,852.03                  ......

2.等待事件

Event                               Waits   Timeouts  Time (cs)    (ms)   /txn---------------------------- ------------ ---------- ----------- ------ ------log file sync                      14,466          2       4,150      3    1.0db file sequential read            17,202          0       2,869      2    1.2latch free                         24,841     13,489       2,072      1    1.7 direct path write                     121          0       1,455    120    0.0db file parallel write              1,314          0       1,383     11    0.1log file sequential read            1,540          0          63      0    0.1....log file switch completion              1          0           3     30    0.0refresh controlfile command            23          0           1      0    0.0LGWR wait for redo copy                46          0           0      0    0.0....log file single write                   4          0           0      0    0.0

我們看到，這裡log file sync和db file parallel write等待同時出現了.
顯然log file sync在等待db file parallel write的完成.

這裡磁碟IO肯定存在了瓶頸，實際使用者的redo和資料檔案同時存放在Raid的磁碟上，存在效能問題.
需要調整.

3.統計資訊

 Statistic                                    Total   per Second    per Trans--------------------------------- ---------------- ------------ ------------....redo blocks written                         93,853         63.9          6.4redo buffer allocation retries                   1          0.0          0.0redo entries                               135,837         92.5          9.3redo log space requests                          1          0.0          0.0redo log space wait time                         3          0.0          0.0redo ordering marks                              0          0.0          0.0redo size                               41,776,508     28,458.1      2,852.0redo synch time                              4,174          2.8          0.3redo synch writes                           14,198          9.7          1.0redo wastage                             4,769,200      3,248.8        325.6redo write time                              3,698          2.5          0.3redo writer latching time                        0          0.0          0.0redo writes                                 14,572          9.9          1.0....sorts (disk)                                     4          0.0          0.0sorts (memory)                             179,856        122.5         12.3sorts (rows)                             2,750,980      1,874.0        187.8....transaction rollbacks                           36          0.0          0.0transaction tables consistent rea                0          0.0          0.0transaction tables consistent rea                0          0.0          0.0user calls                               1,390,718        947.4         94.9user commits                                14,136          9.6          1.0user rollbacks                                 512          0.4          0.0write clones created in backgroun                0          0.0          0.0write clones created in foregroun               11          0.0          0.0          -------------------------------------------------------------

avg.redo write size = (Redo block written/redo writes)*512 bytes      = ( 93,853 / 14,572 )*512       = 3K

這個平均過小了，說明系統的提交過於頻繁.

 Latch Sleep breakdown for DB: DPSHDB  Instance: dpshdb  Snaps: 3473 -3475-> ordered by misses desc                                Get                                  Spin &Latch Name                    Requests         Misses      Sleeps Sleeps 1->4-------------------------- -------------- ----------- ----------- ------------row cache objects              12,257,850     113,299          64 113235/64/0/                                                                  0/0shared pool                     3,690,715      60,279      15,857 52484/588/65                                                                  46/661/0library cache                   4,912,465      29,454       8,876 23823/2682/2                                                                   733/216/0cache buffers chains           10,314,526       2,856          33 2823/33/0/0/                                                                  0redo writing                       76,550         937           1 936/1/0/0/0session idle bit                2,871,949         225           1 224/1/0/0/0messages                          107,950         159           2 157/2/0/0/0session allocation                184,386          44           6 38/6/0/0/0checkpoint queue latch             96,583           1           1 0/1/0/0/0          -------------------------------------------------------------

由於過渡頻繁的提交，LGWR過度頻繁的啟用，我們看到這裡出現了redo writing的latch競爭.

關於redo writing競爭你可以在steve的網站找到詳細的介紹:
http://www.ixora.com.au/notes/lgwr_latching.htm

轉引如下:

When LGWR wakes up, it first takes the redo writing latch to update the SGA variable that shows whether it is active. This prevents other Oracle processes from posting LGWR needlessly. LGWR then takes the redo allocation latch to determine how much redo might be available to write (subject to the release of the redo copy latches). If none, it takes the redo writing latch again to record that it is no longer active, before starting another rdbms ipc message wait.
If there is redo to write, LGWR then inspects the latch recovery areas for the redo copy latches (without taking the latches) to determine whether there are any incomplete copies into the log buffer. For incomplete copies above the sync RBA, LGWR just defers the writing of that block and subsequent log buffer blocks. For incomplete copies below the sync RBA, LGWR sleeps on a LGWR wait for redo copy wait event, and is posted when the required copy latches have been released. The time taken by LGWR to take the redo writing and redo allocation latches and to wait for the redo copy latches is accumulated in the redo writer latching time statistic.

(Prior to release 8i, foreground processes held the redo copy latches more briefly because they did not retain them for the application of the change vectors. Therefore, LGWR would instead attempt to assure itself that there were no ongoing copies into the log buffer by taking all the redo copy latches.)

After each redo write has completed, LGWR takes the redo allocation latch again in order to update the SGA variable containing the base disk block for the log buffer. This effectively frees the log buffer blocks that have just been written, so that they may be reused.

轉自：http://www.eygle.com/statspack/statspack14-LogFileSync.htm

本文章原先以中文撰寫並發佈於 aliyun.com，亦設英文版本，僅作資訊用途。本網站不對文章的準確性，完整性或可靠性或其任何翻譯作出任何明示或暗示的陳述或保證。如對該文章有任何疑慮或投訴，請傳送電郵至 info-contact@alibabacloud.com 並提供相關疑慮或投訴的詳細說明。職員會於 5 個工作天內與您聯絡，一經驗證之後，即會刪除該侵權內容。

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

Statspack之十四-”log file sync” 等待事件（轉貼）

聯繫我們

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support