A Case Study of regular missing histogram Information

Source: Internet
Author: User
The developer feedback system is particularly slow, the system is oracle11.2.0.3 of solarisiscsi $ vmstat35kthrmemorypagediskfaultscpurbwswapfreeremfpipofrdesrm0m1m3m4insycsussyid000278333841071424172108612100312-0365773067835

The feedback system from developers is particularly slow, the system is solaris iSCSI $ vmstat 3 5 kthr memory page disk faults cpu r B w swap free re mf pi po fr de sr m0 m 1 m3 m4 in sy cs us sy id 0 for oracle 11.2.0.3. 0 0 27833384 1071424 172 1 2 1 0 0 3 1 2-0 1086 3657 35

Developers report that the system is particularly slow, and the system is oracle 11.2.0.3's solaris iSCSI

$ vmstat 3 5
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr m0 m1 m3 m4 in sy cs us sy id
0 0 0 27833384 1071424 172 1086 1 2 1 0 0 3 1 2 -0 3657 730678 3594 5 3 92
110 0 0 27838784 1092080 199 1558 0 0 0 0 0 0 0 0 0 7422 141990 6792 96 4 0
115 0 0 27837328 1089032 120 1527 0 0 0 0 0 0 0 0 0 7534 197852 7086 96 4 0
111 0 0 27836128 1086928 104 1547 0 0 0 0 0 0 0 0 0 7667 159280 6972 95 5 0
114 0 0 27833712 1083248 185 2274 0 0 0 0 0 0 0 0 0 7439 166742 6730 96 4 0

bash-3.2$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Mon Aug 4 10:30:26 2014

Copyright (c) 1982, 2011, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> select event,count(*) from v$session where wait_class
'Idle' group by event;

EVENT COUNT(*)
---------------------------------------------------------------- ----------
db file sequential read 2
SQL*Net message to client 5
read by other session 2
latch free 1
latch: cache buffers chains 107

Cpu resources are basically exhausted, and the culprit of cpu resources consumption is that the process is waiting to get the latch of the hash bucket. Generally, the cpu consumption is divided into two parts: 1. Process blocking, this is mostly because of obtaining latch, such as cbc latch, shared pool latch, library cache latch, row cache object latch, or library cache pin and library cache lock, also, the mutex mechanism introduced in 10.2.0.2 causes 2: IO problems, such as a large number of concurrent full table scans and unreasonable table connection methods and sequences.

Use the following SQL statement to obtain the corresponding SQL statement that the cbc waits:
SELECT distinct c. sid,
C. event,
B. username,
A. SQL _text,
A. SQL _id
FROM v $ SQL a, v $ session B, v $ session_Wait c
WHERE c. wait_class
'Idle'
AND c. sid = B. sid
AND B. SQL _hash_value = a. hash_value
Order by c. event desc, a. SQL _text desc

After an SQL statement is diagnosed, it is found that the statistical information of this SQL statement is faulty. the histogram information of a column corresponding to this table is none, resulting in an unreasonable execution plan, this system has been maintained by small fish and has encountered similar problems before.

The SQL statement adjustment part is not listed here. It is strange that the SQL statement corresponding to this frequent table often goes wrong because the corresponding column of the table lacks a histogram, And the histogram information is analyzed manually each time:

execute dbms_stats.gather_table_stats(ownname=>'GH_CC',tabname=>upper('light_workflow_abstractticket')

The first analysis shows that the column histogram information corresponding to the table is still none, and the default collection histogram information is for all columns size auto:
SQL> select dbms_stats.get_param ('method _ opt') from dual;

DBMS_STATS.GET_PARAM('METHOD_OPT')
--------------------------------------------------------------------------------
FOR ALL COLUMNS SIZE AUTO

Force method_opt => 'for all columns size 254' again. The column has a high histogram and the SQL Execution Plan is normal.

Why is the histogram information of this column cleared, and the application does not have any code to collect or delete statistics for this table.

SQL> SELECT owner, table_name, last_analyzed
2 FROM dba_tables AS OF TIMESTAMP SYSTIMESTAMP - 6 / 24
3 WHERE table_name = UPPER ('light_workflow_abstractticket');

OWNER TABLE_NAME
------------------------------ ------------------------------
LAST_ANALYZED
-------------------
CC_ARC LIGHT_WORKFLOW_ABSTRACTTICKET
2014-07-12 06:10:25

GH_CC LIGHT_WORKFLOW_ABSTRACTTICKET
2014-08-03 14:10:41

SQL> select sysdate from dual;

SYSDATE
-------------------
2014-08-04 15:25:55

Because of the analysis again, the last_analyzed of dba_tables has been rewritten. We found that the last analysis was 14:10:41, the oracle 11g statistical analysis job starts from Monday to Friday at, and starts statistical analysis on Saturday at a.m.. It is inferred that the application is excluded from regularly calling the job or scheduler, most likely, this oracle's built-in statistical analysis job deletes statistics.

The histogram of oracle statistics collection columns is method_opt => 'for all columns size auto'. statistics are collected based on the column data distribution and load.

- REPEAT : Collects histograms only on the columns that already have histograms.
- AUTO : Oracle determines the columns to collect histograms based on data distribution and the workload of the columns.
- SKEWONLY : Oracle determines the columns to collect histograms based on the data distribution of the columns.
The default is FOR ALL COLUMNS SIZE AUTO.The default value can be changed using the SET_PARAM Procedure.

Then how to determine the column load is determined based on whether sys. col_usage $ appears

SQL> desc sys.col_usage$;
Name Null? Type
----------------------------------------- -------- ----------------------------
OBJ# NUMBER
INTCOL# NUMBER
EQUALITY_PREDS NUMBER
EQUIJOIN_PREDS NUMBER
NONEQUIJOIN_PREDS NUMBER
RANGE_PREDS NUMBER
LIKE_PREDS NUMBER
NULL_PREDS NUMBER
TIMESTAMP DATE

Oracle's smon will record the filter conditions of the predicates when performing a hard parsing on the SQL statements, and then regularly refresh the information of the columns of these predicates to sys. col_usage $. For details, refer to Liu Da's article about how smon maintains the col_usage $ table.

http://www.oracledatabase12g.com/archives/smon-maintain-col-usage.html

Because the problem does not recur, ainemo cannot view the information load of the sys. col_usage $ column in the corresponding table at the time. Here we can manually simulate and test it:

SQL> create table t_xiaoyu01(id number,name varchar2(10));

Table created.
SQL> insert into t_xiaoyu01 values(1,'op');

1 row created.
SQL> insert into t_xiaoyu01 values(1,'op');

1 row created.

SQL> insert into t_xiaoyu01 values(2,'op');

1 row created.

SQL> commit;

Commit complete.

Here we use the default for all columns size auto for analysis:
SQL> execute dbms_stats.gather_table_stats (ownname => 'sys ', tabname => 't_ XIAOYU01', method_opt => 'for all columns size auto ');

PL/SQL procedure successfully completed.

SQL> select intcol#,timestamp from sys.col_usage$ where obj#=150171;

no rows selected

SQL> select num_buckets,column_name,HISTOGRAM from dba_tab_columns where table_name='T_XIAOYU01';

NUM_BUCKETS COLUMN_NAME HISTOGRAM
----------- ------------------------------ ---------------
1 NAME NONE
1 ID NONE

It is found that the histogram is none, and the information of the above columns does not exist in the col_usage $ table. (It takes nearly 15 minutes for the smon process to refresh the column's load information to the col_usage $ table. You can manually run DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO to refresh the column's load to col_usage $)

Force again to specify for all columns size 254
SQL> execute dbms_stats.gather_table_stats (ownname => 'sys ', tabname => 't_ XIAOYU01', method_opt => 'for all columns size 254 ');

PL/SQL procedure successfully completed.

SQL> select num_buckets,column_name,HISTOGRAM from dba_tab_columns where table_name='T_XIAOYU01';

NUM_BUCKETS COLUMN_NAME HISTOGRAM
----------- ------------------------------ ---------------
1 NAME FREQUENCY
2 ID FREQUENCY

If we find that the histogram information is set to none for all columns size auto analysis again
SQL> execute dbms_stats.gather_table_stats (ownname => 'sys ', tabname => 't_ XIAOYU01', method_opt => 'for all columns size auto ');

PL/SQL procedure successfully completed.

SQL> select num_buckets,column_name,HISTOGRAM from dba_tab_columns where table_name='T_XIAOYU01';

NUM_BUCKETS COLUMN_NAME HISTOGRAM
----------- ------------------------------ ---------------
1 NAME NONE
1 ID NONE

Here, ainemo inferred that this problem may be caused by sys. col_usage $ base table. It may be that smon does not refresh the load information of this column to sys. col_usage $ or sys. col_usage $ the oracle automatic analysis program is modified for some reason to clear the histogram information to none.

I checked some articles about col_usage $ and found that the fixed table is maintained by the smon process, and whether the system has cleared some information of the col_usage $ table, similar articles were not found on mos.

To put it simply, due to the lack of histogram information, the SQL statement is not properly executed. After the histogram information is collected again, the SQL statement selects a reasonable execution plan. Why is the histogram information missing, it is inferred that most of them are caused by the program of the oracle 11g automatic statistics, and the job for automatic analysis will clear the column histogram, this is probably because of the method_opt => for all columns size auto collection method, which is only collected in sys. col_usage $ the column of the table corresponding to the table, and why sys. the load information of the preceding columns in the col_usage $ table may not be found.

To ensure the stable operation of the system, ainemo can only write a stored procedure corresponding to method_opt => 'for all columns size 254' and collect the table every hour.

Here we can imagine a simple cbc latch wait. The hotspot block phenomenon throws a lot of oracle internal knowledge, and this case does not find the root cause of the problem. Is it the histogram cleared by the automatically analyzed program, then, if the histogram is cleared by the automatically analyzed program, most of them are sys. why is the load record of the column corresponding to col_usage $ cleared by oracle.

Due to its limited level, the customer also needs to restore the normal use of the system as soon as possible. This case cannot be further analyzed for the moment. I hope there will be updated progress in this case later, A little progress every day, a little more!

Original article address: A Case Study of regular missing histogram information. Thank you for sharing it with the original author.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.