前陣子的alert日誌獲得了所需堆尺寸的大小超出指定闕值的提示,即Heap size 80869K exceeds notification threshold (51200K)。從Oracle 10.2.0.2之後該闕值已經增加到了50MB。通過LRU演算法理論上來說應該是足夠的。這個問題是由於SGA中片段太多,一時間無法找到容納當前語句於是拋出該提示資訊。需要注意的是此時資料庫並沒有給出ora-04031錯誤提示。下面給出具體的描述。
一、alert 日誌提取的提示資訊
Fri Feb 7 19:47:21 2014Memory Notification: Library Cache Object loaded into SGAHeap size 80869K exceeds notification threshold (51200K)Details in trace file /u02/database/UK3200/udump/UK3200_ora_6240.trcKGL object name :INSERT INTO ACCOUNTING_FILE_DTL_TBL (ACCOUNT_CODE_ID, ACCOUNT_CODE, ACCOUNT_NAME, VOUCHER_NO, ACC_NUM, CURR_CD, ACCOUNT_PERIOD, VALUE_DATE, TRANS_DATE, ACC_POS_HIST_ID, DEBIT_CREDIT, TRANS_DESC, TRANS_TYPE_CD, AE_ID, AMOUNT, CATEGORY, FX, DESCRIPTION, CUST_SUB_TYPE,VOU_TYPE, DISPLAY_TYPE, IS_HIDE_WHEN_ZERO, TYPE_GRP, DESC_ACCTING_STD, ACCOUNT_JV_TYPE_ID, JV_DESC, INPUT_DATE ) WITH VOU_TBL AS ( SELECT A.ACC_NUM, A.CURR_CD, SUBSTR (:B2 , 1, 6) AS ACCOUNT_PERIOD, A.VALUE_DATE, A.APPROVAL_DATE, A.TRANSACTION_CMon Feb 10 18:22:32 2014Memory Notification: Library Cache Object loaded into SGAHeap size 80865K exceeds notification threshold (51200K)KGL object name :INSERT INTO ACCOUNTING_FILE_DTL_TBL (ACCOUNT_CODE_ID, ACCOUNT_CODE, ACCOUNT_NAME, VOUCHER_NO, ACC_NUM, CURR_CD, ACCOUNT_PERIOD, VALUE_DATE, TRANS_DATE, ACC_POS_HIST_ID, DEBIT_CREDIT, TRANS_DESC, TRANS_TYPE_CD, AE_ID, AMOUNT, CATEGORY, FX, DESCRIPTION, CUST_SUB_TYPE,VOU_TYPE, DISPLAY_TYPE, IS_HIDE_WHEN_ZERO, TYPE_GRP, DESC_ACCTING_STD, ACCOUNT_JV_TYPE_ID, JV_DESC, INPUT_DATE ) WITH VOU_TBL AS ( SELECT A.ACC_NUM, A.CURR_CD, SUBSTR (:B2 , 1, 6) AS ACCOUNT_PERIOD, A.VALUE_DATE, A.APPROVAL_DATE, A.TRANSACTION_C#udump下的記錄檔達到724MB,這個是這個提示累計下來的結果oracle@linux-1234:~> ls -hltr /u02/database/UK3200/udump/UK3200_ora_6240.trc-rw-r----- 1 oracle oinstall 724M 2014-02-07 19:47 /u02/database/UK3200/udump/UK3200_ora_6240.trc
二、故障分析
對於一個需要執行的SQL或者PL/SQL時,該代碼需要從library cache中分配一塊連續的空閑空間來解析語句。Oracle首先掃描shared pool尋找空閑記憶體,如果沒有發現大小正好合適的空閑chunk,就尋找更大的chunk,如果找到比請求的大小更大的空閑chunk,則將它分裂,多餘部分繼續放到空閑列表中。於是產生了片段。系統經過長時間運行後,就會產生大量小的記憶體片段。當請求分配一個較大的記憶體塊時,儘管shared pool總空閑空間還很大,但是沒有一個單獨的連續空閑塊能滿足需要,則系統會根據LRU演算法來淘汰那些不再使用的語句,這之後依舊一直無法分配到記憶體空間,則就可能產生4031錯誤。
通常情況下,如果檢查發現shared_pool_size足夠大,那4031錯誤一般就是由於片段太多引起的。
儘管片段無法避免,但應儘可能減少片段。以下是可能產生片段的一些潛在因素:
沒有使用共用SQL;
過多的沒有必要的解析調用(軟解析);
沒有使用綁定變數
對於我們的這個情況是記憶體夠用,一時無法獲得連續的空閑記憶體塊。不過該insert語句的確比較長,接近400行。
關於shared_pool與SGA的分析
--環境SQL> select * from v$version where rownum<2;BANNER----------------------------------------------------------------Oracle Database 10g Release 10.2.0.3.0 - 64bit Production--下面是關於對象重載的查詢結果,儘管存在一些INVALIDATIONS,總體表現良好SQL> @reload_ratio.sqlNAMESPACE GETS GETHITS GETHIT_RATIO PINS PINHITS PINHIT_RATIO RELOADS INVALIDATIONS--------------- ---------- ---------- ------------ ---------- ---------- ------------ ---------- -------------SQL AREA 607614 100098 16.47 268976344 267458010 99.44 103083 101598TABLE/PROCEDURE 4601210 4474456 97.25 29426797 28904427 98.22 178625 0BODY 39663 34617 87.28 2503211 2490492 99.49 6638 0TRIGGER 72063 70249 97.48 1631610 1628323 99.8 1470 0INDEX 94655 51973 54.91 469366 380268 81.02 3599 0CLUSTER 27630 27383 99.11 61368 60883 99.21 238 0OBJECT 0 0 100 0 0 100 0 0PIPE 0 0 100 0 0 100 0 0JAVA SOURCE 160 150 93.75 239 102 42.68 84 0JAVA RESOURCE 81 75 92.59 243 160 65.84 27 0JAVA DATA 162 159 98.15 1130 1126 99.65 0 011 rows selected.--關於library cache的命中率SQL> SELECT SUM (pins) "Executions", 2 SUM (reloads) "Cache Misses while Executing", 3 ROUND ( (SUM (pins) / (SUM (reloads) + SUM (pins))) * 100, 2) "Hit Ratio, %" 4 FROM V$LIBRARYCACHE;Executions Cache Misses while Executing Hit Ratio, %---------- ---------------------------- ------------ 303544356 294260 99.9--當前系統此時可用的空閑記憶體 SQL> SELECT pool,name,bytes/1024/1024 FROM v$sgastat WHERE name LIKE '%free memory%' AND pool = 'shared pool';POOL NAME BYTES/1024/1024------------ -------------------------- ---------------shared pool free memory 91.7747498--有關shared_pool 的advice,從結果來看,當前的shared_pool為340mbSQL> SELECT shared_pool_size_for_estimate est_size, 2 shared_pool_size_factor size_factor, 3 estd_lc_size, 4 estd_lc_memory_objects obj_cnt,estd_lc_time_saved_factor sav_factor 5 FROM v$shared_pool_advice;EST_SIZE SIZE_FACTOR ESTD_LC_SIZE OBJ_CNT SAV_FACTOR---------- ----------- ------------ ---------- ---------- 196 .5765 36 2822 .803 232 .6824 71 4400 .8567 268 .7882 106 5114 .9078 304 .8941 142 6095 .9532 340 1 177 7757 1 376 1.1059 212 9903 1.0434 412 1.2118 248 12080 1.0874 448 1.3176 283 14288 1.1382 484 1.4235 317 16277 1.1818 520 1.5294 351 16612 1.2166 556 1.6353 386 18515 1.2442 592 1.7412 421 18663 1.2679 628 1.8471 456 18885 1.2877 664 1.9529 490 19127 1.3033 700 2.0588 525 19255 1.3164--當前的sga advice結果即便是增加到715mb,ESTD_DB_TIME時間為219849,效果並不理想--也就是說增加sga的大小對當前資料庫作用不大 --Author: Leshami--Blog : http://blog.csdn.net/leshami SQL> select sga_size,sga_size_factor,estd_db_time from v$sga_target_advice order by 1; SGA_SIZE SGA_SIZE_FACTOR ESTD_DB_TIME---------- --------------- ------------ 286 .5 675749 429 .75 257214 572 1 222542 715 1.25 219849 858 1.5 219048 1001 1.75 219048 1144 2 219048
三、Oracle提出的解決方案
Memory Notification: Library Cache Object Loaded Into Sga (Doc ID 330239.1) To Bottom
APPLIES TO:
Oracle Database - Enterprise Edition - Version 10.2.0.1 and later
Information in this document applies to any platform.
Oracle Server Enterprise Edition
.
SYMPTOMS
The following messages are reported in alert.log after 10g Release 2 is installed.
Memory Notification: Library Cache Object loaded into SGA
Heap size 2294K exceeds notification threshold (2048K)
CHANGES
Installed / Upgraded to 10g Release 2
CAUSE
These are warning messages that should not cause the program responsible for these errors to fail. They appear as a result of new event messaging mechanism and memory manager in 10g Release 2.
The meaning is that the process is just spending a lot of time in finding free memory extents during an allocate as the memory may be heavily fragmented. Fragmentation in memory is impossible to eliminate completely, however, continued messages of large allocations in memory indicate there are tuning opportunities on the application.
The messages do not imply that an ORA-4031 is about to happen.
SOLUTION
In 10g we have a new undocumented parameter that sets the KGL heap size warning threshold. This parameter was not present in 10gR1. Warnings are written if heap size exceeds this threshold.
Set _kgl_large_heap_warning_threshold to a reasonable high value or zero to prevent these warning messages. Value needs to be set in bytes.
If you want to set this to 8192 (8192 * 1024) and are using an spfile:
(logged in as "/ as sysdba")
SQL> alter system set "_kgl_large_heap_warning_threshold"=8388608 scope=spfile ;
SQL> shutdown immediate
SQL> startup
If using an "old-style" init parameter,
Edit the init parameter file and add
_kgl_large_heap_warning_threshold=8388608
NOTE: The default threshold in 10.2.0.1 is 2M. So these messages could show up frequently in some application environments.
In 10.2.0.2, the threshold was increased to 50MB after regression tests, so this should be a reasonable and recommended value.
最終的解決:
1、根據meatlink修改隱藏參數,不能確定該設定是否存在負面影響,也沒有google到這個問題的答案2、調整代碼(略))SQL> @hidden_paraEnter value for para: large_heapKSPPINM KSPPSTVL DESCRIB--------------------------------- -------------------- ---------------------------------------------_kgl_large_heap_warning_thresh old 52428800 maximum heap size before KGL writes warnings to the alert logSQL> alter system set "_kgl_large_heap_warning_threshold"=82809856 scope=spfile ;