The index clustering factor is a key statistic that can improve both the Oracle optimizer's performance and the technician's understanding of the utility of an index.
Oracle's optimizer uses it to help determine the cost of index range scans in comparison to full table scans.
Calculating the Clustering Factor
To calculate the clustering factor of an index while gathering index statistics, Oracle does the following.
For each entry in the index, Oracle compares the table block identified by the entry's rowid with the block of the previous index entry.
If the block is different, Oracle increments the clustering factor by 1.
The minimum possible clustering factor is equal to the number of blocks identified through the index's list of rowids; for an index on a column or columns containing no nulls, this will be equal to the number of blocks in the table that contain data. The maximum clustering factor is the number of entries in the index.
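The counting procedure and its two bounds can be illustrated with a minimal Python sketch. This is not Oracle's implementation: the function name `clustering_factor` is hypothetical, and each rowid is reduced to a bare table block number, whereas a real rowid also carries file and row-slot components.

```python
from typing import Iterable


def clustering_factor(rowid_blocks: Iterable[int]) -> int:
    """Count block changes while walking index entries in index order.

    rowid_blocks holds, for each index entry in index order, the table
    block that the entry's rowid points to (an assumed simplification).
    """
    factor = 0
    previous = None
    for block in rowid_blocks:
        # The first entry has no predecessor, so it counts as a change.
        if block != previous:
            factor += 1
        previous = block
    return factor


# Six index entries whose rows sit in three table blocks.
well_clustered = [10, 10, 11, 11, 12, 12]  # rows stored in index-key order
scattered = [10, 11, 12, 10, 11, 12]       # rows spread across the blocks

print(clustering_factor(well_clustered))  # 3, the number of table blocks
print(clustering_factor(scattered))       # 6, the number of index entries
```

The first list reaches the minimum (one count per table block) and the second reaches the maximum (one count per index entry).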
Interpreting the Clustering Factor
This gives Oracle a statistic with which to estimate how many table blocks would be visited by an index range scan.
If the clustering factor is close to the number of entries in the index, then an index range scan of 1000 index entries may require nearly 1000 blocks to be read from the table.
If the clustering factor is close to the number of blocks in the table, then an index range scan of 1000 index entries may require only 50 blocks to be read from the table (for example, where each block holds about 20 rows).
This can be compared with the cost of reading the entire table up to the high-water mark (using the more efficient multiblock I/O mechanism) to determine whether a full table scan or an index range scan offers the most efficient access mechanism.
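A rough sketch of this comparison follows, written in Python for illustration only. It assumes the estimate of table blocks visited is the clustering factor scaled by the fraction of index entries scanned, and that a full scan reads a fixed number of blocks per multiblock request. The function names, the table sizes, and the read count of 16 are all hypothetical. The real optimizer also costs the index blocks themselves, CPU, and the differing timings of single-block and multiblock reads.

```python
def range_scan_table_blocks(clustering_factor: int, index_entries: int,
                            entries_scanned: int) -> int:
    """Estimate table block visits: clustering factor scaled by the
    fraction of index entries scanned, rounded up (integer arithmetic)."""
    return (clustering_factor * entries_scanned + index_entries - 1) // index_entries


def full_scan_reads(blocks_below_hwm: int, multiblock_read_count: int) -> int:
    """Estimate multiblock read requests needed to reach the high-water mark."""
    return (blocks_below_hwm + multiblock_read_count - 1) // multiblock_read_count


# Assumed table: 1,000,000 rows in 50,000 blocks (about 20 rows per block).
ENTRIES = 1_000_000
BLOCKS = 50_000

scattered = range_scan_table_blocks(ENTRIES, ENTRIES, 1_000)  # factor near entry count
clustered = range_scan_table_blocks(BLOCKS, ENTRIES, 1_000)   # factor near block count
full_scan = full_scan_reads(BLOCKS, 16)                       # assumed 16 blocks per read

print(scattered)  # 1000
print(clustered)  # 50
print(full_scan)  # 3125
```

Under these assumptions a range scan of 1000 entries beats the full scan either way, but scanning 10,000 entries through the poorly clustered index would visit about 10,000 table blocks, more than the 3125 multiblock reads of the full scan.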
Note that where extensive deletes have occurred from the table, there may be blocks containing no rows. The clustering factor accounts for these, because such blocks do not appear in the index's rowid list. A full table scan will still read all table blocks up to the high-water mark, regardless of whether they contain rows. So in an extreme case Oracle could see from the index and table statistics that, although a table has 1,000,000 blocks below the high-water mark, reading 100% of the rows in the table might require reading only 10 of those blocks. Provided that "Not Null" constraints tell Oracle that all table rows are present in the index, a query such as "select * from big_table_with_few_rows" might be satisfied more efficiently with an index range scan than with a full table scan.
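The effect of heavy deletes can be simulated on a small scale. The sketch below is illustrative Python, not Oracle code. It assumes a hypothetical table of 1,000 blocks loaded in index-key order with 20 rows per block, and it represents each rowid by its block number only. Counting runs of consecutive equal block numbers with `itertools.groupby` gives the same result as incrementing a counter on every block change.

```python
from itertools import groupby

BLOCKS_BELOW_HWM = 1_000  # assumed table size; deletes do not lower the high-water mark
ROWS_PER_BLOCK = 20       # assumed row density


def clustering_factor(rowid_blocks):
    """Number of runs of consecutive equal block numbers in index order."""
    return sum(1 for _ in groupby(rowid_blocks))


# Block number behind each index entry, in index order, before any deletes.
before = [b for b in range(BLOCKS_BELOW_HWM) for _ in range(ROWS_PER_BLOCK)]

# Delete every row except those stored in the first 10 blocks.
after = [b for b in before if b < 10]

cf_before = clustering_factor(before)
cf_after = clustering_factor(after)

print(cf_before)         # 1000
print(cf_after)          # 10
print(BLOCKS_BELOW_HWM)  # 1000, still read in full by a table scan
```

After the deletes, reading every remaining row through the index touches 10 table blocks, while a full table scan in this model still reads all 1,000 blocks below the high-water mark.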
Note also that the clustering factor is calculated without reference to the table at all; it is based solely on information contained in the index.