Oracle Performance Analysis 4: Full scan of data access methods

Last Update:2014-09-01 Source: Internet

Author: User

Tags compact

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

When SQL statements are executed, the Oracle optimizer determines how tables are accessed based on statistics, and generally, there are two basic ways to access data: 1) Full scan. In a full scan (full-table scan or fast full-index scan), multiple blocks are read into an IO operation. 2) Index Scan. Index scans first scan the index leaf blocks to obtain a specific row ID (ROWID), and then use these row IDs to access the parent table to get the actual row data, accessed through a single block of reads. The full scan method is explained here, and the index scan is introduced later.

Use full scan

When a table is fully scanned, all blocks of data in the table are taken out and processed to filter out the qualifying data. Note that Oracle must read the data in the entire block of data into memory before obtaining the qualifying data. So Oracle's optimizer needs to be concerned about two pieces of information: getting the number of blocks and the amount of data discarded in each block. The optimizer will determine whether to use the full scan based on these two information, first let's see how the number of fetch blocks affects the optimizer's choice.

Get the number of blocks

In general, a full scan should be used if the query needs to take out most of the data blocks of the table. However, since it is difficult to assess the number of data blocks for the table that the query will take out, there are many such "rules of thumb" in using a full scan: When your query takes X-percent rows from the table, you should select Full Scan. These laws have some reason, but not accurate, because when the data row is larger, naturally take out the data block will be larger, when the use of full scanning is not a problem, but sometimes although the data rows taken out smaller, will be taken out of the data block may be larger, in fact, should also use a full scan, but these "rules of thumb" is no longer valid. Let's take a look at the specific examples below.
We create a table T1:

CREATE table T1 as select Trunc ((rownum-1)/ID, rownum value from  dba_source where rownum <= 10000

Then create an index for T1:

CREATE index idx_t1_id on T1 (ID)

Then collect statistics for T1:

BEGIN  dbms_stats.gather_table_stats (user,                                ' t1 ',                                method_opt = ' for all COLUMNS SIZE 1 ',                                Cascade    = TRUE); END;

Then we execute the query:

SELECT * FROM t1 where id = 0

The execution plan for the query is as follows:

SELECT STATEMENT, GOAL = all_rows TABLE ACCESS by index ROWID  index RANGE SCAN

The execution plan uses an index range scan, because the data that matches the condition ID 0 has only 100 rows of data in the table, and the entire table has 10,000 rows of data, and the data is only 1% of the entire data, so we think this is a reasonable execution plan.
Next we look at the following example to create a table T2:

CREATE TABLE t2 as select mod (rownum,100) ID, rownum value from  dba_source where rownum <= 10000

Also create an index for T2:

Create index idx_t2_id on T2 (ID)

Then collect statistics for T2:

BEGIN  dbms_stats.gather_table_stats (user,                                ' T2 ',                                method_opt = ' for all COLUMNS SIZE 1 ',                                Cascade    = TRUE); END;

Then execute the query:

SELECT * FROM t2 where id = 0

The execution plan for the query is as follows:

SELECT STATEMENT, GOAL = all_rows TABLE ACCESS full

We see the table execution plan into a full table scan, we can easily get the results of the query is still 100 data, accounting for the total data of T2 table 1%, if we further compare the data of T2 and T1, we will find the ID field of the two tables exactly the same, then why T1 chose the index Scan, And T2 chose a full-table scan?
To understand the reason, we need to analyze the distribution of the data on the data block, in the T1 table, the ID field is distributed as follows:

0 0...0 0 1 1...1 1 2 2...2 2......88 88...88 88......99 99...99 99

The ID field in the T2 table is distributed as follows:

0 1 2 3 ... 98 99 0 1 2 3 ... 98 99 ... 0 1 2 3 ... 98 99

It can be seen from here that the data with ID 0 in the T1 table is concentrated on several blocks, while the data with ID 0 in the T2 table is distributed on many different blocks, which results in T1 queries requiring only a few blocks to be read, so the index range scan is used, and the query on T2 needs to read most blocks. Therefore, the optimizer selects a full table scan.

Abandon

It is important to note that the efficiency of the full scan depends not only on the number of data blocks being read, but also on the number of final result set rows. As we can see from the above example, when a block is read, the query discards the non-conforming data based on the filtering criteria. The process of abandonment is resource-intensive, as this operation is in memory, so it consumes CPU resources, and the larger the amount of data discarded, the more CPU resources are consumed.
Therefore, the more data blocks are read, the larger the amount of data discarded, the higher the cost of the full scan.
It is not difficult to imagine that when the data volume of the table is increasing, the number of discarded rows is increasing, the cost of the full scan is increasing, which may eventually cause the optimizer to abandon the full scan and choose the index scan instead.

Multi-block Read mode

Multi-scan uses multiple blocks of read, that is, a single IO call will read multiple blocks, the number of blocks read is variable, but there is an upper limit, specified by the Db_file_multiblock_read_count parameter, which is viewed through the following SQL:

SELECT * from v$parameter where name = ' Db_file_multiblock_read_count '

The following describes the number of blocks that Oracle reads in several cases:

1) Oracle has to read data blocks that exceed a certain boundary range.
In this case, Oracle will read the block of data in a single call until the bounds of the boundary, and then initiate another invocation to read the remaining blocks.
2) the presence block is already in memory
First read the block that is already in memory, and then initiate the call to read the remainder of the block, which means that multiple reads may read only one block at a time. For example, suppose that the upper limit for multiple reads is 16, the block number of the read is 1-16, and the blocks numbered even are already in memory, in this case, there will be 8 times of a single read tune to read odd numbered blocks.
3) Multi-block read size exceeds operating system limit
This is dependent on your operating system and is therefore mutable.

High water Mark Line

The so-called high watermark is the last chunk of data written in the table. It is important to note that even though almost all rows of data have been deleted, and some blocks have actually completely become empty, the high watermark remains the same. Look at the following example, when the table is created and the data is inserted:

As the data changes (deleted and modified) later, the data in the table changes to:

Although many storage areas already have no data, the high watermark remains the same.
So what is the impact of the high watermark on the full scan?
When performing a full scan, Oracle will always read the data blocks that are in the high watermark in the table, even if they are empty, which means that many blocks of data that do not actually need to be read are also read.

The following is a concrete example of using the previous table T2.

1) Determine the number of data blocks that the table contains by using the following statement:

Select blocks from user_segments where segment_name = ' T2 ' Result: 24

2) determine how many data blocks in the table contain data;

Select COUNT (Distinct (Dbms_rowid.rowid_block_number (ROWID)) block_ct from T2 results: 17

3) Execute the following query and view the trace information (see Oracle Performance Analysis 1 for information about trace)

alter system flush buffer_cache;--Clean Cache select * FROM t2 where id = 0

The trace information is:

Call     count       CPU    elapsed       Disk      query    current        rows-------------  ------------------ ------------------------------  ----------Parse        1      0.00       0.00          0          0          0           0Execute      1      0.00       0.00          0          0          0           0Fetch        2      0.00       0.06          0         -------------  ------------------------------------------------  ----------Total        4      0.00       0.07          0 100Misses in         Library cache during Parse:1optimizer Mode:all_ rowsparsing user Id:5  Rows     Row Source operation-------  -------------------------------------------- -------  TABLE ACCESS full T2 (cr=20 pr=18 pw=0 time=35975 US)

Query out 100 rows of data, the number of physically read blocks (disk) is 18, including a header data block read (only 17 data blocks contain data). The execution plan uses a full table scan.

4) Perform the delete data operation

Delete from  T2

5) Retrieve the number of data blocks contained in the table

Select blocks from user_segments where segment_name = ' T2 ' Result: 24

6) Get the number of data blocks that contain data

Select COUNT (Distinct (Dbms_rowid.rowid_block_number (ROWID)) block_ct from T2 results: 0

7) Execute the query and view the trace information

alter system flush buffer_cache;--Clean Cache select * FROM t2 where id = 0

The trace information is:

Call     count       CPU    elapsed       Disk      query    current        rows-------------  ------------------ ------------------------------  ----------Parse        1      0.00       0.00          0          0          0           0Execute      1      0.00       0.00          0          0          0           0Fetch        1      0.00       0.21          0           0-------------  ------------------------------------------------  ----------        Total 3      0.00       0.22          0 0Misses in           Library cache during Parse:1optimizer Mode:all_ rowsparsing user Id:5  Rows     Row Source operation-------  -------------------------------------------- -------      0  TABLE ACCESS full T2 (cr=20 pr=18 pw=0 time=214806 US)

We can see that the number of rows queried is 0, but still physically reads 18 blocks of data, and the execution plan uses a full scan.

Fixed high watermark

We have already understood the performance problems caused by high watermark to full scan, and some methods to reduce the high watermark are described below.

Using the TRUNCATE operation

TRUNCATE TABLE_NAME

Use truncate as much as possible when deleting data to reduce the high watermark.

Move action

ALTER TABLE TABLE_NAME MOVE

Note that the move operation requires additional tablespace storage, which locks the table so that other concurrent user-executed DML statements on the table will produce a wait. The move operation affects the index on the table, so the index needs to be rebuild.

Shrink operation

The shrink space operation does not require any additional room, but is much slower than the move. The shrink command is divided into the following two types:

1) Only compressed space does not adjust the watermark, can be executed when the business is busy

ALTER TABLE TABLE_NAME SHRINK SPACE Compact

The compact operation arranges the data in front of the segment as much as possible through a series of INSERT, delete operations. In this process, you need to add the RX lock on the table, that is, only the lines that need to be moved are locked. However, because of the ROWID changes involved, the Enable row movement is required.

2) Adjust the watermark to generate a lock, can be executed when the business is relatively small, Oracle will remember the operation in 1 steps, only adjust the watermark

ALTER TABLE big_table shrink space

Working with new tables

Copy the data to be persisted to the temporary table t,drop the original table, and then rename the temporary table T as the original table.

Oracle Performance Analysis 4: Full scan of data access methods

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More