In-depth analysis of Oracle direct path loading-append

Last Update:2018-12-05 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

(I) direct path loading and buffer cache

Data inserted in the direct path is directly formatted as an oracle block from PGA without passing through the buffer cache.
Then, a common ORACLE Server process writes data blocks to data files.
Dbwn is not required because it does not pass through the buffer cache.

For example, if Table A is available, you need to insert data in Table B into Table A. Under normal insertion, You need to first Insert the data block I/O of Table A to the buffer cache.
Read from Block A in buffer cache and insert it into block B.
At this time, the blocks of B are changed to dirty blocks, and then wait for dbwn to flush them to the data file.
Therefore, after normal inserts, the blocks of table A and table B will appear in the buffer cache.

Directly Insert the data block I/O of Table A to the buffer cache, read the trip, and directly write it into the data file of Table B.
After the insertion is complete, except the header block, the data block of Table B does not appear in the buffer cache.

Test:

HR @ orcl> Create Table A (ID number, name varchar2 (10); table created. HR @ orcl> Create Table B (ID number, name varchar2 (10); table created. HR @ orcl> insert into a values (1, 'A'); 1 row created. HR @ orcl> insert into a values (2, 'bb'); 1 row created. HR @ orcl> insert into a values (3, 'cc'); 1 row created. HR @ orcl> insert into a values (4, 'dd'); 1 row created. HR @ orcl> commit; Commit complete. HR @ orcl> select transform (rowid), dbms_rowid.rowid_block_number (rowid) from a; dbms_rowid.rowid_relative_fno (rowid) dbms_rowid.rowid_block_number (rowid) ------------------------------------ 4 508 4 508 4 508 -- table A has four rows, occupying 508 of the block, currently, table B still has data-Clear the buffer cache HR @ orcl> alter system flush buffer_cache; System altered. -- insert data from Table A to table B using a direct path. HR @ orcl> insert/* + append */into B select * From A; 4 rows created. HR @ orcl> commit; Commit complete. -- use v $ BH to view the block HR @ orcl> select file #, block # from V $ BH where objd = (select data_object_id from user_objects where object_name = 'A '); file # block # ---------- 4 508 blocks contains the data currently 4 508 blocks contain the data currently 4 511 4 511 4 506 4 509 4 509 4 512 4 512 4 507 4 4 4 4 4 507 4 510 4 510 4 50514 rows selected. -- This is because a full table scan is performed on Table A. All the blocks at the low and high levels in Table A are read into the buffer cache, of course, this includes 508 HR @ orcl> select file #, block # from V $ BH where objd = (select data_object_id from user_objects where object_name = 'B '); file # block # ---------- 4 2571 4 2569 4 2570hr @ orcl> select partition (rowid), dbms_rowid.rowid_block_number (rowid) from B; partition (rowid) dbms_rowid.rowid_block_number (rowid) ------------------------------------ ---------------------------------- 4 2572 4 2572 4 2572 4 2572 -- the preceding two Queries show that the data in Table B occupies 2572 blocks. However, after directly inserting the data in the path, 2572 is not transferred to buffer cache -- buffer cache contains only 2569 2570 2571 -- where 2571 is the segment header block (select header_file, header_block from dba_segments where segment_name = 'B ') -- 2570 2569 is the two bitmap blocks L1 L2 -- then use the normal insert HR @ orcl> alter system flush buffer_cache; System altered. HR @ orcl> insert into B select * From A; 4 rows created. HR @ orcl> commit; Commit complete. HR @ orcl> select file #, block # from V $ BH where objd = (select data_object_id from user_objects where object_name = 'B '); file # block # ---------- 4 2571 4 2574 4 2569 4 2575 4 2570 4 2570 4 2573 4 2576 rows contains the 8 rows selected block where the common inserted data is located. HR @ orcl> select transform (rowid), dbms_rowid.rowid_block_number (rowid) from B; dbms_rowid.rowid_relative_fno (rowid) dbms_rowid.rowid_block_number (rowid) ------------------------------------ 4 2572 4 2572 4 2572 4 2572 4 2576 4 2576 4 2576 4 25768 4 rows selected.

As shown in the above experiment, normal insertion requires data blocks to be first transmitted to the buffer cache.
This is a common way for Oracle to modify data. Instead of directly modifying the data file, the data file is modified in the memory, which is then protected by logs.
This is a desirable method for small changes, but direct paths can provide better performance for Big Data transactions.

In addition, direct path loading is the insertion action completed above the high water level. Therefore, no matter how many idle blocks under the high water level are ignored, the section space will increase accordingly.

(Ii) direct path loading and undo

The obvious difference is that undo (the direct path almost does not have UNDO), and the two Redo are similar (the normal insert is a little more Redo, because Oracle needs redo to protect UNDO)
Direct path rollback with hwm, normal insertion with undo rollback

-- Insert HR @ orcl> insert/* + append */into B select ID + 4, name from a; 4 rows created to table B again. -- View transaction information HR @ orcl> select xidusn, xidslot, xidsqn, ubafil, ubablk, ubasqn from V $ transaction; xidusn xidslot xidsqn ubafil ubablk ubasqn ---------- 10 23 314 0 0 0 -- because there is only one transaction currently, therefore, when you select the V $ Transaction View, no conditions are added. The preceding result shows that ubafil ubablk is 0, that is, there is no corresponding rollback block for this transaction -- only one row is occupied in the transaction table of the rollback segment header.

So how does direct path insertion provide rollback? You can solve this problem by observing the hwm changes in table B.

-- Find hwmhr @ orcl> select header_file, header_block from dba_segments where segment_name = 'B' in Table B '; header_file header_block ----------- ---------- 4 2571hr @ orcl> alter session set tracefile_identifier = 'hr _ 2571 '; Session altered. HR @ orcl> alter system dump datafile 4 block 2571; System altered. -- The TRC file is extracted as follows: extent control header extends extent header: spare1: 0 spare2: 0 # extents: 2 # Blocks: 16 last map 0x00000000 # maps: 0 offset: 2716 highwater: 0x01000a11 ext #: 0 BLK #: 8 ext size: 8 # Blocks in seg. HDR's freelists: 0 # blocks below: 8 mapblk 0x00000000 offset: 0 unlocked slow low highwater MARK: highwater: 0x01000a11 ext #: 0 BLK #: 8 ext size: 8 -- the high water level point is file no. 4 block No. 2577 -- after submission, check which directory the direct path is inserted to HR @ orcl> select dbms_rowid.rowid_relative_fno (rowid), dbms_rowid.rowid_block_number (rowid) from B where ID> = 8; dbms_rowid.rowid_relative_fno (rowid) dbms_rowid.rowid_block_number (rowid) limit 4 2577 -- through the above query, this direct path is inserted, and data is saved in block 2577, the submitted hwm is exactly 2577 | -------------- | -------- | ---------- | data block ...... | 2575 | 2576 | 2577 | 2578 |... | -------------- | -------- | ---------- high water level point. The direct path is inserted to allocate space from this block. The direct path is inserted to allocate a temporary segment above the high water level point, insert data into this temporary segment-raise the high level point to the temporary segment after submission-it has already been submitted, let's take a look at the high water level information HR @ orcl> alter system dump datafile 4 block 2571; System altered. -- The TRC file is extracted as follows: extent control header extends extent header: spare1: 0 spare2: 0 # extents: 2 # Blocks: 16 last map 0x00000000 # maps: 0 offset: 2716 highwater: 0x01000a12 ext #: 1 BLK #: 1 ext size: 8 # Blocks in seg. HDR's freelists: 0 # blocks below: 9 mapblk 0x00000000 offset: 1 unlocked slow low highwater MARK: highwater: 0x01000a12 ext #: 1 BLK #: 1 ext size: 8 -- the first dump is highwater: 0x01000a11, but now it is highwater: 0x01000a12 -- the high water level is increased to 2578, for example, | -------------- | -------- | ----- | data block ...... | 2575 | 2576 | 2577 | 2578 | -------------- | -------- | ----- when the high water level is reached here -- try again when inserting the rollback directly next, this insert should be inserted to 2578. If submitted, the height point will be increased to 2579. If rolled back, HR @ orcl> insert/* + append */into B select ID + 4, name from a; 4 rows created. HR @ orcl> commit; Commit complete. -- TRC Abstract: extent control header extends extent header: spare1: 0 spare2: 0 # extents: 2 # Blocks: 16 last map 0x00000000 # maps: 0 offset: 2716 highwater: 0x01000a13 ext #: 1 BLK #: 2 ext size: 8 # Blocks in seg. HDR's freelists: 0 # blocks below: 10 mapblk 0x00000000 offset: 1 Disk lock: locked by Xid: 0x0006. 012.0000018d slow low highwater MARK: highwater: 0x01000a13 ext #: 1 BLK #: 2 ext size: 8 -- the high water level point is changed to 0x01000a13, Which is 2579 blocks-the next step is rollback, rollback will keep the high water level unchanged, that is, in the first 2579 HR @ orcl> insert/* + append */into B select ID + 12, name from; 4 rows created. HR @ orcl> rollback; rollback complete. HR @ orcl> alter system dump datafile 4 block 2571; System altered. -- The TRC file is extracted as follows: extent control header extends extent header: spare1: 0 spare2: 0 # extents: 2 # Blocks: 16 last map 0x00000000 # maps: 0 offset: 2716 highwater: 0x01000a13 ext #: 1 BLK #: 2 ext size: 8 # Blocks in seg. HDR's freelists: 0 # blocks below: 10 mapblk 0x00000000 offset: 1 unlocked lower highwater MARK: highwater: 0x01000a13 ext #: 1 BLK #: 2 ext size: 8 -- 0x01000a13, that is, 2579 blocks, rollback, and the high level points remain unchanged.

In the process of executing direct path loading, the high water level does not really increase. This action is completed only after the transaction is committed, and the table can be accessed only after all maintenance work is completed.
Therefore, if you want to query this table before submitting the table, you can see that addition, deletion, modification, and merge operations on the table are not allowed.

(Iii) direct path loading and Index

When inserting a direct path, the rollback information of the table block is not generated, but the rollback is implemented based on the high water level point.
However, if the table has an index, the index rollback information will be generated, and the index block will be read into the buffer cache.
Test:

-- Create an index HR @ orcl> Create index idx_ B on B (ID); index created for table B. HR @ orcl> select file #, block # from V $ BH where objd = (select data_object_id from user_objects where object_name = 'idx _ B '); file # block # ---------- 4 2587 bytes when the first block of the segment 4 2585 bytes when L1 block 4 2588 bytes when the first index data block 4 2586 bytes when the L2 block-restart the database, clear buffer cachehr @ orcl> select file #, block # from V $ BH where objd = (select data_object_id from user_objects where object_name = 'idx _ B '); no rows selectedhr @ orcl> insert/* + append */into B select * From A; 4 rows created. HR @ orcl> select file #, block # from V $ BH where objd = (select data_object_id from user_objects where object_name = 'idx _ B '); file # block # ---------- 4 2588 -- when the direct path is inserted, the index block will still be transferred to buffer cachehr @ orcl> select xidusn, xidslot, xidsqn, ubafil, ubablk, returns from V $ transaction; xidusn xidslot xidsqn ubafil ubablk ubasqn ---------- ------------ ---------- 6 41 399 2 1456 -- and changes to the index block will generate rollback information, the rollback information is stored at 1456 of the rollback block. Therefore, the index does not "insert directly". Therefore, the inserted index data should be at a high level: HR @ orcl> select header_file, header_block from region where segment_name = 'idx _ B '; header_file header_block ------------------ 4 2587hr @ orcl> alter system dump datafile 4 block 2587; System altered. -- The TRC file is extracted as follows: extent control header extends extent header: spare1: 0 spare2: 0 # extents: 1 # Blocks: 8 last map 0x00000000 # maps: 0 offset: 2716 highwater: 0x01000a1d ext #: 0 BLK #: 4 ext size: 8 # Blocks in seg. HDR's freelists: 0 # blocks below: 1 mapblk 0x00000000 offset: 0 unlocked slow low highwater MARK: highwater: 0x01000a1d ext #: 0 BLK #: 4 ext size: 8 -- the high water level points at 2589 blocks, and the inserted index data is at 2588 points.

Oracle official documentation suggests that if you use direct path insertion to transfer a large amount of data to a table, you can delete the index in the table first, and then re-create the index after the insertion.

(Iv) direct path loading and some restrictions

Note the following when using the direct path loading method:
1) the direct path loading method is not supported by all insert methods. The most common insert statement with a value clause is not supported.
2) This technique is most commonly used in insert statements of the insert into... select... structure.
3) when using direct path loading technology to insert data until the transaction is committed, other operations such as adding, deleting, modifying, querying, and merge are prohibited
4) because it is a direct path load, idle database blocks below the high level will not be used, which may lead to unlimited expansion of data segments.
5) when an insert trigger, foreign key, Iot, or table exists on the operated table, clustering technology is used, and the table contains the lob field
The direct path loading technology is invalid and will be automatically converted to regular insert

(V) Summary

The reason why direct path loading technology can improve performance is that this method can minimize the generation of rollback data during data loading.
Expdp/impdp does not support direct paths, but sqlldr (specified by the parameter direct = true and parallel = true at the same time, the speed will be faster)
If we can maximize this method after balancing the pros and cons, it is inevitable to increase the loading speed!

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

In-depth analysis of Oracle direct path loading-append

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support

In-depth analysis of Oracle direct path loading-append

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support