How to Improve Data Pump Operation Performance

Source: Internet
Author: User


Data Pump export and import performance can often be improved substantially. This article introduces Data Pump and database parameters that can significantly improve the performance of Data Pump operations.

1. Data Pump parameters that affect performance

ACCESS_METHOD
In some cases, the access method chosen automatically by the Data Pump API is not the fastest way to reach your data. There is no way to know which method is more efficient other than explicitly setting this parameter and testing each one. The parameter has two options: direct_path and external_table.
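The two access methods can be compared empirically by exporting a small, representative table with each one and comparing the elapsed times in the log files. A minimal sketch (the schema, directory object, and file names are placeholders):

```
expdp scott DIRECTORY=dp_dir DUMPFILE=emp_direct.dmp LOGFILE=emp_direct.log TABLES=scott.emp ACCESS_METHOD=DIRECT_PATH
expdp scott DIRECTORY=dp_dir DUMPFILE=emp_ext.dmp LOGFILE=emp_ext.log TABLES=scott.emp ACCESS_METHOD=EXTERNAL_TABLE
```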

CLUSTER=N
In a RAC environment, setting cluster=n can significantly speed up Data Pump operations. Note that this parameter affects only Data Pump operations. Setting parallel_force_local to true achieves a similar effect in RAC, but it affects more than just Data Pump operations.
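A hedged sketch of an export with cluster=n (the directory, dump file, and schema names are placeholders):

```
expdp system DIRECTORY=dp_dir DUMPFILE=hr.dmp SCHEMAS=hr CLUSTER=N
```

With cluster=n, the worker processes all start on the instance where the job is submitted rather than being spread across the RAC nodes.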

DATA_OPTIONS=DISABLE_APPEND_HINT
This is an impdp-only parameter. In specific situations it can be used safely and may reduce import time. Use data_options=disable_append_hint only when all of the following conditions are met:
1. The import loads data into existing tables, partitions, or subpartitions.
2. The number of existing objects being imported is very small (for example, 10 or fewer).
3. While the import runs, other sessions only execute select statements against the imported objects.
data_options=disable_append_hint is available only in 11.2.0.1 and later. It saves time only when the imported objects would otherwise be locked for a long time by other sessions.
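A sketch of an import that meets the three conditions above (all object and file names are placeholders):

```
impdp scott DIRECTORY=dp_dir DUMPFILE=emp.dmp TABLES=scott.emp TABLE_EXISTS_ACTION=APPEND DATA_OPTIONS=DISABLE_APPEND_HINT
```

disable_append_hint makes the import use conventional inserts instead of direct-path APPEND inserts, so it does not have to wait for an exclusive lock on the existing table.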

ESTIMATE
The estimate parameter has two mutually exclusive options: blocks and statistics. During an export, estimating the dataset size with the blocks method takes longer than with the statistics method, but the estimate it produces is more accurate. If the size of the export file is not your main concern, we recommend estimate=statistics.
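For example (schema and file names are placeholders):

```
expdp scott DIRECTORY=dp_dir DUMPFILE=scott.dmp SCHEMAS=scott ESTIMATE=STATISTICS
```

Note that the accuracy of estimate=statistics depends on the tables having reasonably current optimizer statistics.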

EXCLUDE=COMMENT
In some cases, end users do not need the comments on columns and object types. Skipping this data shortens the Data Pump operation.
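For example (names are placeholders):

```
expdp scott DIRECTORY=dp_dir DUMPFILE=scott.dmp SCHEMAS=scott EXCLUDE=COMMENT
```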

EXCLUDE=STATISTICS
If you do not also need the include parameter, excluding statistics from the export shortens the overall export time. Statistics can be regenerated with dbms_stats.gather_database_stats after the data is imported into the target database. If the Data Pump engine and another RDBMS session concurrently generate statistics on small tables, the Data Pump operation may hang indefinitely. For Data Pump operations that take longer than one hour, consider disabling the database's automatic statistics collection task so that the Data Pump operation does not compete with it.
To temporarily disable the automatic statistics collection task in 11g, run the following as the sys user:
exec dbms_auto_task_admin.disable(client_name => 'auto optimizer stats collection', operation => NULL, window_name => NULL);
After the Data Pump operation completes, re-enable the statistics collection task:
exec dbms_auto_task_admin.enable(client_name => 'auto optimizer stats collection', operation => NULL, window_name => NULL);

To temporarily disable the automatic statistics collection job in 10g, run the following as the sys user:
exec sys.dbms_scheduler.disable('GATHER_STATS_JOB');
After the Data Pump operation completes, re-enable the statistics collection job:
exec sys.dbms_scheduler.enable('GATHER_STATS_JOB');
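An export that skips statistics might look like this (names are placeholders); statistics can then be regathered on the target after the import:

```
expdp scott DIRECTORY=dp_dir DUMPFILE=scott.dmp SCHEMAS=scott EXCLUDE=STATISTICS
```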

NETWORK_LINK
Using this parameter effectively limits Data Pump parallelism. Unless your network throughput and bandwidth are better than those of your local devices, an operation over network_link will be much slower than one using dump files. network_link is recommended only as a last resort. Consider moving dump files on portable or shared storage instead of using network_link for data migration.

PARALLEL
If the host has multiple CPUs, the operation is not CPU-bound, I/O-bound, or memory-bound, and multiple dump files are specified in the dumpfile parameter, parallel execution has a positive impact on performance. If parallel is set to N (N > 1), we recommend specifying at least N dump files so that parallel execution can be used effectively.

Note that parallel is the maximum number of concurrent Data Pump worker processes the operation may use. Depending on bottlenecks in the host environment, Data Pump may use fewer workers than specified, and the underlying operation can sometimes be faster when parallel is set below the number of available CPUs.
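A sketch combining parallel with a matching number of dump files, using the %U wildcard so each worker can write to its own file (the directory name is a placeholder):

```
expdp system DIRECTORY=dp_dir DUMPFILE=full_%U.dmp FULL=Y PARALLEL=4
```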

QUERY
Using the query parameter significantly increases the overhead of the underlying Data Pump operation. The overhead is proportional to the volume of data in the queried table.

REMAP_*
Using any remap_* parameter significantly increases the overhead of the underlying Data Pump operation. The overhead is proportional to the volume of data being remapped.

2. Database parameters that affect Data Pump performance
AQ_TM_PROCESSES=0
When this parameter is explicitly set to 0, it can negatively affect Advanced Queuing, and in turn the Data Pump operations that rely on it. Reset the parameter to its default, or set it to a value greater than 0.

DEFERRED_SEGMENT_CREATION=TRUE
Applies to import operations only; it eliminates the time spent allocating space for empty tables. Setting this parameter has no significant effect on export performance. It is available in 11.2.0.2 and later.
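Assuming the parameter is not already set, it can be enabled before a large import; a sketch using the standard initialization-parameter syntax:

```
ALTER SYSTEM SET deferred_segment_creation = TRUE;
```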

FILESYSTEMIO_OPTIONS=...
In some circumstances where the database instance writes to an ACFS file system as part of an export, this parameter determines how the Data Pump API performs those writes, and values other than NONE may slow the export down.

NLS_CHARACTERSET=... and NLS_NCHAR_CHARACTERSET=...
When these two parameters differ between the source and the target database, an import cannot use multiple Data Pump worker processes to create and populate a given partitioned table. In that case a single worker operates on the table data while holding an exclusive lock on the table, preventing any other worker from touching the same table. When no such exclusive lock is needed, multiple workers can operate on a partitioned table concurrently, which significantly improves import performance for partitioned tables.
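The two character set parameters can be compared by running the following query on both the source and the target database:

```
SELECT parameter, value
FROM nls_database_parameters
WHERE parameter IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET');
```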

NLS_COMP=... and NLS_SORT=...
In some rare cases, setting these two database parameters to binary significantly speeds up Data Pump operations. Test whether setting them to binary improves performance in your environment. They can be set at the session level with a logon trigger such as the following:
CREATE OR REPLACE TRIGGER sys.expdp_nls_session_settings AFTER LOGON ON DATABASE
DECLARE
  V_MODULE VARCHAR2(60);
BEGIN
  SELECT SYS_CONTEXT('USERENV', 'MODULE') INTO V_MODULE FROM dual;
  IF UPPER(V_MODULE) LIKE 'UDE%'
  THEN
    BEGIN
      EXECUTE IMMEDIATE 'alter session set NLS_COMP=''BINARY''';
      EXECUTE IMMEDIATE 'alter session set NLS_SORT=''BINARY''';
    END;
  END IF;
END;
/

PARALLEL_FORCE_LOCAL=TRUE
In a RAC environment, this parameter can significantly improve Data Pump performance and avoids bugs affecting parallel DML operations. It is available only in 11.2.0.2 and later.

STREAMS_POOL_SIZE
To avoid bug 17365043 ('Streams AQ: enqueue blocked on low memory when reducing streams_pool_size'), we recommend setting streams_pool_size to the value returned by the following query:
select 'ALTER SYSTEM SET STREAMS_POOL_SIZE='||(max(to_number(trim(c.ksppstvl)))+67108864)||' SCOPE=SPFILE;'
from sys.x$ksppi a, sys.x$ksppcv b, sys.x$ksppsv c
where a.indx = b.indx and a.indx = c.indx
and lower(a.ksppinm) in ('_streams_pool_size', 'streams_pool_size');

_MEMORY_BROKER_STAT_INTERVAL=999
If SGA resize operations consume a lot of time in your slow Data Pump environment, setting this parameter reduces the frequency of resize operations, and thereby the cumulative resize latency over a given time span. This matters because the Data Pump API relies heavily on Streams functionality during export and import. If streams_pool_size has been set explicitly and resize operations occur frequently, we recommend setting this parameter to 999.

3. Table DDL-level factors that affect Data Pump performance
NETWORK_LINK + securefiles
Moving a table that contains a LOB column stored as securefiles over network_link is very slow, and it generates a large amount of undo data. The reason is that distributed transactions allocate space across a database link only one data block at a time, so larger data sets take proportionally longer to transmit.

SECUREFILES (without network_link)
Storing LOB column data in the securefiles format allows tables containing LOB columns to be exported and imported in parallel.
Storing LOB column data in the basicfiles format does not; tables containing LOB columns then cannot be exported or imported in parallel.

4. Table DML-level factors that affect Data Pump performance
Contention between Data Pump operations and sessions accessing the same database objects (usually table and row locks)
When exporting, the Data Pump engine waits for other sessions to release the row and table locks they hold before exporting or importing the affected tables, whereas the legacy export utility does not. Exporting a table that is being updated frequently is therefore slower than exporting a table that is not currently being updated.
