Oracle Database 10g: Best New Features (Week 4: High-Speed Export/Import)

Week 4
High-Speed Export/Import: Oracle Data Pump

Data movement with the Oracle Database 10g utilities has been greatly improved.

Until now, the export/import toolset has been the utility of choice for transferring data across multiple platforms with the least labor, despite common complaints about its slow speed. It is no surprise: import simply reads each record from the export dump file and inserts it into the destination table with an ordinary INSERT INTO statement, so importing can be a slow process.

Oracle Data Pump, a new counterpart to the export/import toolkit introduced in Oracle Database 10g, is designed to speed this process up many times over.

Data Pump reflects a complete overhaul of the export/import process. Instead of using ordinary SQL statements, it uses a proprietary API to load and unload data at a much faster rate. In my tests, export performance was 10-15 times better than export in direct mode, and import performance was about 5 times better. In addition, unlike the export utility, Data Pump can also extract only specific types of objects, such as procedures.

Data Pump Export

The new utility is called expdp, to distinguish it from exp, the original export utility. In this example, we will use Data Pump to export a large table, CASES, which is about 3GB in size. Data Pump uses file handling on the server side to create and read files, so directory objects are used as locations. Here we will use the file system /u02/dpdata1 to hold the dump files.


create directory dpdata1 as '/u02/dpdata1';
grant read, write on directory dpdata1 to ananda;



Next, we'll export the data:




EXPDP ananda/abc123 tables=cases directory=dpdata1 dumpfile=expcases.dmp job_name=cases_export



Let's analyze the parts of this command. The meaning of the user ID/password combination, the tables, and the dumpfile parameters is obvious. Unlike the original export, the file is created on the server, not the client. The location is specified by the directory parameter value DPDATA1, which points to the previously created /u02/dpdata1. The process also creates a log file (also on the server) at the location specified by the directory parameter. By default, the process uses a directory named DPUMP_DIR, so you could create that instead of DPDATA1.
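If you prefer not to rely on the default log file name, you can name it explicitly with the logfile parameter. A minimal sketch, assuming expcases.log is simply our chosen file name:


EXPDP ananda/abc123 tables=cases directory=dpdata1 dumpfile=expcases.dmp logfile=expcases.log job_name=cases_export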



Note the parameter job_name above, a special parameter that does not exist in the original export. All Data Pump work is done through jobs. Data Pump jobs, unlike DBMS jobs, are simply server processes that process the data on behalf of the main process. The main process, called the master control process, coordinates this work through Advanced Queuing (AQ); it does so by means of a special table created at run time, known as the master table. In our example, if you check the schema of user ANANDA while the job is running, you will notice a table named CASES_EXPORT (corresponding to the expdp parameter job_name). The table is dropped when expdp finishes.
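As an illustration, you can confirm the presence of the master table from a session connected as ANANDA while the job is running, with a simple dictionary query (the table name matches the job_name):


select table_name from user_tables where table_name = 'CASES_EXPORT';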

Export monitoring

While a Data Pump export (DPE) is running, pressing Control-C stops the messages from being displayed on the screen, but it does not stop the export process itself. Instead, it displays the DPE prompt (shown below). The process is now said to be in interactive mode:


Export>



This mode allows several commands to be entered against the DPE job. To view a summary, use the STATUS command at the prompt:


Export> status
Job: CASES_EXPORT
  Operation: EXPORT
  Mode: TABLE
  State: EXECUTING
  Degree: 1
  Job Error Count: 0
  Dump file: /u02/dpdata1/expcases.dmp
    bytes written = 2048

Worker 1 Status:
  State: EXECUTING
  Object Schema: DWOWNER
  Object Name: CASES
  Object Type: TABLE_EXPORT/TBL_TABLE_DATA/TABLE/TABLE_DATA
  Completed Objects: 1
  Total Objects: 1
  Completed Rows: 4687818



Remember, this is just a status display; the export keeps working in the background. To continue viewing the messages on the screen, issue the command CONTINUE_CLIENT from the Export> prompt.
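For example, to leave the interactive prompt and resume the scrolling log output, the session would look like this:


Export> continue_client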

Parallel operations

You can use more than one thread for the export through the PARALLEL parameter, which can significantly speed up the job. Each thread creates a separate dump file, so the dumpfile parameter should have as many entries as the degree of parallelism. Instead of explicitly listing individual file names, you can specify a wildcard in the file name, for example:


EXPDP ananda/abc123 tables=cases directory=dpdata1 dumpfile=expcases_%u.dmp parallel=4



Note that the dumpfile parameter contains the wildcard %u, which indicates that files will be created as needed, named in the format expcases_nn.dmp, where nn starts at 01 and increases as needed.



In parallel mode, the status screen shows four worker processes. (In default mode, only one process is visible.) All worker processes extract data simultaneously and show their progress on the status screen.



It is important to separate the I/O channels used to access the data files from those used for the dump-file directory file system. Otherwise, the overhead of maintaining the Data Pump job may outweigh the benefits of parallel threads and actually degrade performance. Parallelism is effective only when the number of tables exceeds the parallel value and the tables are large.
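One way to do that, as a sketch, is to create a second directory object on a different mount point (the path /u03/dpdata2 and the name dpdata2 are hypothetical here) and spread the dump files across both directories using the directory:filename form of the dumpfile parameter:


create directory dpdata2 as '/u03/dpdata2';
grant read, write on directory dpdata2 to ananda;

EXPDP ananda/abc123 tables=cases directory=dpdata1 dumpfile=dpdata1:expcases_%u.dmp,dpdata2:expcases_%u.dmp parallel=2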



Database monitoring



You can also obtain more information about a running Data Pump job from database views. The main view for monitoring jobs is DBA_DATAPUMP_JOBS, which tells you, among other things, how many worker processes (column DEGREE) are working on the job. Another important view is DBA_DATAPUMP_SESSIONS; combined with the view above and V$SESSION, it gives the session SID of the main foreground process.
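A quick way to check the state of a job is to query DBA_DATAPUMP_JOBS directly; the following is a minimal sketch that selects only a few of the view's columns:


select owner_name, job_name, operation, job_mode, state, degree
from dba_datapump_jobs;


The join with V$SESSION mentioned above looks like this: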



select sid, serial#
from v$session s, dba_datapump_sessions d
where s.saddr = d.saddr;



This query displays the session of the main foreground process. More useful information can be obtained from the alert log. When the process starts, the master control process (MCP) and the worker processes appear in the alert log as follows:



kupprdp: master process DM00 started with pid=23, OS id=20530 to execute -
  SYS.KUPM$MCP.MAIN('CASES_EXPORT', 'ANANDA');
kupprdp: worker process DW01 started with worker id=1, pid=24, OS id=20532 to execute -
  SYS.KUPW$WORKER.MAIN('CASES_EXPORT', 'ANANDA');
kupprdp: worker process DW03 started with worker id=2, pid=25, OS id=20534 to execute -
  SYS.KUPW$WORKER.MAIN('CASES_EXPORT', 'ANANDA');




This shows the PIDs of the sessions started for the Data Pump operation. You can find the actual SIDs with the following query:



select sid, program
from v$session
where paddr in (select addr from v$process where pid in (23, 24, 25));



The PROGRAM column shows the process name DM (for the master process) or DW (for worker processes), matching the names in the alert log. If a worker process uses parallel query slaves, say the worker with SID 23, you can see them in the view V$PX_SESSION. It shows all the parallel query sessions running on behalf of the worker process represented by SID 23:



Select Sid from V$px_session where qcsid = 23;



Additional useful information for predicting the time needed to complete the job can be obtained from the view V$SESSION_LONGOPS:



select sid, serial#, sofar, totalwork
from v$session_longops
where opname = 'CASES_EXPORT'
and sofar != totalwork;



The column TOTALWORK shows the total amount of work, and SOFAR shows how much has been done so far, so you can use the two to estimate how much longer the job will take.
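For instance, a rough percent-complete figure can be derived from the same view; a sketch (the computed column name pct_done is just illustrative):


select sid, serial#, sofar, totalwork,
       round(sofar / totalwork * 100, 2) pct_done
from v$session_longops
where opname = 'CASES_EXPORT'
and sofar != totalwork;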



Data Pump Import



Import performance, however, is where Data Pump really shines. To import the previously exported data, we will use:



IMPDP ananda/abc123 directory=dpdata1 dumpfile=expcases.dmp job_name=cases_import



The default behavior of the import process is to create the table and all associated objects, and to raise an error if the table already exists. If you want to append data to an existing table, add table_exists_action=append to the command line above.
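For example, to load the rows into an existing CASES table, the command would look like this:


IMPDP ananda/abc123 directory=dpdata1 dumpfile=expcases.dmp job_name=cases_import table_exists_action=append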



As with Data Pump export, pressing Control-C during the process enters the interactive mode of Data Pump import (DPI); its prompt is Import>.



Working with specific objects



Have you ever needed to export a particular procedure from one user in order to recreate it in a different database or schema? Unlike the traditional export utility, Data Pump allows you to export only particular types of objects. For example, the following command exports only procedures and nothing else; tables, views, and even functions are not exported:



EXPDP Ananda/iclaim directory=dpdata1 dumpfile=expprocs.dmp include=procedure



To export only specific named objects, for example the function FUNC1 and the procedure PROC1, you could use:



EXPDP ananda/iclaim directory=dpdata1 dumpfile=expprocs.dmp include=PROCEDURE:\"=\'PROC1\'\",FUNCTION:\"=\'FUNC1\'\"



The dump file serves as a backup of the source objects. You can even use it to create DDL scripts for later use. A special parameter called SQLFILE allows the creation of a DDL script file:



IMPDP Ananda/iclaim directory=dpdata1 dumpfile=expprocs.dmp sqlfile=procs.sql



This statement creates a file named procs.sql in the directory specified by DPDATA1, containing the scripts of the objects in the export dump file. This approach helps you quickly create the source objects in another schema.



The INCLUDE parameter lets you define which objects to include in the dump file. You can use the clause INCLUDE=TABLE:"LIKE 'TAB%'" to export only those tables whose names begin with TAB. Similarly, you can use the construct INCLUDE=TABLE:"NOT LIKE 'TAB%'" to exclude all tables whose names begin with TAB. Alternatively, you can use the EXCLUDE parameter to exclude specific objects.
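On the operating system command line, the double and single quotes generally need to be escaped, as in the earlier procedure/function example. A sketch of the TAB% case (the dump file name exptabs.dmp is just an illustrative choice):


EXPDP ananda/abc123 directory=dpdata1 dumpfile=exptabs.dmp include=TABLE:\"LIKE \'TAB%\'\"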



Data Pump can also be used, together with external tables, to transport tablespaces; it is extremely powerful in its ability to redefine parallelism on the fly, attach more tables to an existing process, and so on (these topics are beyond the scope of this article; for details, see Oracle Database Utilities 10g Release 1 10.1). The following command displays a list of all the parameters of the Data Pump export utility:



EXPDP help=y



Similarly, IMPDP help=y displays all the parameters of DPI.



While Data Pump jobs are running, you can pause them by issuing STOP_JOB at the DPE or DPI prompt and later restart them with START_JOB. This capability comes in handy when you run out of space and want to make corrections before continuing.
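A typical sequence, sketched below, is to stop the job at the interactive prompt, reattach to it later with the ATTACH parameter, and then restart it:


Export> stop_job

EXPDP ananda/abc123 attach=cases_export

Export> start_job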



For more information, read Part 1 of the Oracle Database Utilities 10g Release 1 10.1 guide.

