The new utility in Oracle Database 10g takes import and export performance and versatility to a new level.
The new export and import facility added in Oracle Database 10g, Oracle Data Pump (Data Pump), completely changes the client/server way of working that database users have grown used to over the past several generations of Oracle databases. The server now runs the export and import jobs. You can load or unload large amounts of data quickly in parallel, and you can adjust the degree of parallelism while a job is running. Export and import jobs are now restartable, so a failure no longer means starting over from scratch. The API is publicly documented and easy to use; it is simple to create an export or import job from PL/SQL. Once started, these jobs run in the background, but you can check their status and modify them from any client utility, from anywhere.
Architecture
Prior to Oracle Database 10g (Oracle7 through Oracle9i), the Export and Import utilities ran as clients and did the bulk of the work. Data being exported was read by the database instance, passed over the connection to the export client program, and then written to disk. All of the data was funneled through a single thread for the entire export. Data volumes today are much larger than when this architecture was first adopted, and that single export process has become a bottleneck: the performance of an export job is limited by the throughput the Export utility can sustain.
With Oracle Database 10g and the new Data Pump architecture, all of the work is now done by the database instance, which can process a job in parallel in two ways: by creating multiple Data Pump worker processes to read and write the data being exported or imported, and by creating parallel I/O server processes to select or insert that data more quickly. The single-process bottleneck is gone.
Data Pump jobs are created, monitored, and adjusted with the new DBMS_DATAPUMP PL/SQL API. The new export and import utilities (expdp and impdp, respectively) are nothing more than command-line interfaces to that API. You can use the Data Pump export utility to start a job, such as an export, then close your client, go home, and enjoy dinner; your job keeps running. Late in the evening, you can reattach to the job, check its status, and even raise the degree of parallelism so that the system does more work during the hours when no users are on it. The next morning, you can lower the degree of parallelism, or suspend the job entirely, to free resources for daytime online users.
Restartability is an important feature of the Data Pump architecture. You can stop and restart a Data Pump job at any time, for example to free resources for online users. You can also recover easily from space problems in the file system. If a 12-hour export fails after 11 hours because it runs out of disk space, you no longer need to restart the job from the beginning and repeat the first 11 hours. Instead, you can attach to the failed job, add one or more new dump files, restart from the point of failure, and finish in perhaps only another hour. That is a great help when you work with large amounts of data.
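Since jobs can be created directly from PL/SQL, here is a minimal sketch of what a schema-mode export might look like through DBMS_DATAPUMP. This is illustrative only: the job name and dump file name are made up, and the EXPORT_DUMPS directory object is the one created later in this article.

DECLARE
  h NUMBER;
BEGIN
  -- Open a schema-mode export job (exports the current schema by default)
  h := DBMS_DATAPUMP.OPEN(
         operation => 'EXPORT',
         job_mode  => 'SCHEMA',
         job_name  => 'PLSQL_DEMO_EXPORT');

  -- Name a dump file in an existing directory object
  DBMS_DATAPUMP.ADD_FILE(
    handle    => h,
    filename  => 'plsql_demo.dmp',
    directory => 'EXPORT_DUMPS');

  -- Start the job and detach; the job keeps running on the server
  DBMS_DATAPUMP.START_JOB(h);
  DBMS_DATAPUMP.DETACH(h);
END;
/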
Access to the file system
Having the server handle all file I/O is a real benefit for database administrators who run export and import jobs remotely. On Unix-like systems (such as Linux), users can easily telnet or SSH to the server and start an export or import job from the command line. It is not so easy on other operating systems, Windows being the most obvious example. Before Data Pump, exporting a large amount of data from an Oracle database on Windows might have meant running the command at the server console itself, because exporting over a TCP/IP connection is practical only for small volumes of data. Data Pump changes all of that: even when you start an export or import job by running the Export or Import utility on your client, the job actually runs on the server, and all I/O takes place on the server as well.
For security reasons, Data Pump requires you to specify the target directory for the dump files you create or read by means of an Oracle directory object. For example:
CREATE DIRECTORY export_dumps
   AS 'C:';

GRANT READ, WRITE
   ON DIRECTORY export_dumps
   TO gennick;
Logged on to my lab database as SYSTEM, I ran the statements above to create a directory object pointing to a scratch directory on my disk where I can write export dump files. The GRANT statement gives the user gennick (that is, me) access to the directory. I granted myself both read and write access because I will both export and import data. A user who only needs to import data can be granted read access alone.
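If you want to confirm the directory object and the operating-system path it maps to, a quick query such as the following (assuming you have access to the ALL_DIRECTORIES view) will do it:

-- Confirm the directory object exists and see the path it points to
SELECT directory_name, directory_path
  FROM all_directories
 WHERE directory_name = 'EXPORT_DUMPS';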
Starting an export job
You start an export job with the new expdp utility. Its parameters differ from those of the old exp utility, so you will need to become familiar with them. You can specify parameters on the command line, but in this article I use a parameter file. To export my entire schema, I use the following parameters:
dumpfile=gnis%U.dmp
directory=export_dumps
logfile=gnis_export.log
job_name=gnis_export
The dumpfile parameter specifies the file (or files) to which the exported data is written. The %U syntax provides an incrementing counter, producing names such as gnis01.dmp and gnis02.dmp. The directory parameter names my target directory object.
My logfile parameter names the log file, which is created by default for every export job. Job_name assigns a name to the job. I chose a name that is easy to remember (and to type), because I may need to attach to the job later. When choosing a job name, take care not to collide with the name of any schema object in the schema you log in with. Data Pump creates a table in that schema, called the job's master table, whose name matches the job name. The master table tracks the state of the job and is ultimately written to the dump file as a record of the file's contents.
Listing 1 shows the export job being started. The job's first step is to estimate the disk space it will need. When that estimate is displayed, I press Ctrl-C to drop into the interactive export prompt, and from there I use the exit_client command to return to my operating-system command prompt. The export job keeps running on the server.
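Listing 1 is not reproduced here, but assuming the parameters above are saved in a file named gnis_export.par (the file name is my own choice), launching the job would look something like this:

expdp gennick/password parfile=gnis_export.par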
NOTE: If I wanted to export in parallel and spread the I/O across two disks, I could change the dumpfile parameter value and add a parallel parameter, as follows:
dumpfile=export_dumps01:gnis%U.dmp,
         export_dumps02:gnis%U.dmp
parallel=2
Note that in this parallel export, a directory name is specified as part of each file name.
Checking status
You can attach to a running job at any time to check its status. To attach to an export job, run expdp with the attach parameter, specifying the job name. Listing 2 shows an attach to the gnis_export job. When you attach, expdp displays information about the job and its current status, and leaves you at an Export> prompt.
Once attached, you can issue the status command at any time to see the current status. You can also issue the continue_client command to return to the logging output that shows the job's progress; this command is abbreviated to continue in Listing 4.
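Listing 2 is not reproduced here, but attaching to the job and asking for its status would look roughly like this (the password is a placeholder):

expdp gennick/password attach=gnis_export

Export> status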
For a quick view of all Data Pump jobs, you can query the dba_datapump_jobs view. It does not give you the detail that the status command provides, but it is a fast way to see which jobs are running and which are idle. Another view worth knowing is dba_datapump_sessions, which lists the sessions attached to active Data Pump jobs.
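A query along these lines gives the quick overview; the columns shown are the ones I find most useful:

-- One row per Data Pump job, with its current state
SELECT owner_name, job_name, operation, job_mode, state
  FROM dba_datapump_jobs;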
Recovering from failure
The ability to restart a job lets you recover from some kinds of failure. For example, Listing 5 shows the end of the log file for an export job that has run out of dump-file space. Nothing is lost, though; the job simply goes idle. You can see this when you attach to the job and look at the status output. The status does not tell you why the job is idle; to learn that the dump file ran out of space, check the log file.
Once you have attached to a job that stopped because it ran out of dump-file space, you have two choices: you can terminate the job with the kill_job command, or you can add one or more dump files and let the job continue. If the disk itself is out of space, make sure the files you add are on another disk that has room. You may need to create a new Oracle directory object pointing to the new location.
In Listing 6, I use the add_file command to add two files to my idle job. The two files are in directories different from the one specified for the job's first dump file. I then restart the job with the start_job command and use continue to watch the remaining log output on screen.
Import jobs are not affected by running out of dump-file space, but they can be affected by a tablespace that fills up and cannot extend. Recovering an import is essentially the same as recovering an export: first make space available, by adding a data file to the tablespace, enlarging an existing data file, or some other means; then attach to the job and issue the start_job command. The import continues from where it was interrupted.
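Listing 6 is not reproduced here, but the interactive commands would follow this pattern; the directory object names and file names below are hypothetical:

Export> add_file=dump_dir2:gnis_spill%U.dmp, dump_dir3:gnis_spill%U.dmp
Export> start_job
Export> continue_client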
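As a sketch of the import-recovery case, making room might be as simple as adding a data file; the tablespace name, file name, and size here are examples only:

-- Give the target tablespace room to grow
ALTER TABLESPACE users
  ADD DATAFILE 'users02.dbf' SIZE 500M;

With space available, you would reattach with impdp and the attach parameter (for example, attach=gnis_import) and issue start_job at the Import> prompt.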
Importing selected data
The example in this article exports, at the schema level, all of the objects owned by gennick. To demonstrate some of Data Pump's new capabilities, I want to import that data back, and to make things more interesting I have set the following requirements:
Import only the GNIS table.
Import the table into the michigan schema.
Import only the rows pertaining to Michigan.
Do not bring over the original storage parameters.
To begin, I can put the following four lines in my import parameter file:
dumpfile=gnis%U.dmp
directory=export_dumps
logfile=gnis_import.log
job_name=gnis_import
There is nothing new in these four lines; they specify the dump files, directory, log file, and job name. To satisfy the first requirement, I can use the include parameter to restrict the import to the one table I am interested in:
include=TABLE:"= 'GNIS'"
Include is an interesting parameter. When you want to import only part of a dump file, you can take either of two approaches:
You can use one or more include parameters to list exactly the objects you want imported.
You can use the exclude parameter to list what you do not want and import everything else.
Because I want just one object, it is far easier to explicitly include that object than to explicitly exclude everything else. The first part of my include parameter value is the keyword TABLE, indicating that the object I want to import is a table (it might otherwise be, say, a function or a procedure). Next comes a colon, followed by a WHERE-clause predicate. I want specifically the table named GNIS, so the predicate is "= 'GNIS'". You can write more elaborate predicates when you need them. Between include and exclude, you can control exactly, and at a fine granularity, which data and objects are imported or exported. I encourage you to read the documentation on these two parameters carefully; they are more powerful and versatile than I can describe in this article.
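For a feel of the exclude side, here are two hypothetical filters (not part of this article's example): the first skips all indexes, the second skips every table whose name starts with TEMP.

exclude=INDEX
exclude=TABLE:"LIKE 'TEMP%'"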
I can just as easily change schemas, remapping the table from the gennick schema to the michigan schema:
remap_schema=gennick:michigan
I want only the rows for Michigan. For that, I can use the query parameter to supply a WHERE clause:
query="WHERE gnis_state_abbr = 'MI'"
Query was available in the old utilities too, but only for exports. Data Pump supports query on import as well, because Data Pump takes advantage of Oracle's new external table feature. Whenever possible, Data Pump uses a direct path to move data, reading from the database data files and writing directly to the export dump file, or reading the dump file and writing directly into the database data files. When you specify the query parameter, however, Data Pump uses an external table instead. For an import job, Data Pump creates an external table over the dump file with the oracle_datapump access driver and runs an INSERT ... SELECT ... FROM statement.
My last requirement is to avoid importing the storage parameters associated with the exported table. I want the new GNIS table in the michigan schema to take the default storage parameters of that schema's default tablespace. The reason: michigan's default tablespace is not large enough to hold the table at its original size, only the rows pertaining to Michigan. With the transform parameter, I can tell the import job not to bring over any of the original table's segment attributes:
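The same oracle_datapump access driver is available for your own use with external tables. As a hedged illustration (the table name gnis_unload and file name are my own), this statement would unload the GNIS table into a Data Pump format file:

-- Unload GNIS to a Data Pump format file via an external table
CREATE TABLE gnis_unload
  ORGANIZATION EXTERNAL (
    TYPE oracle_datapump
    DEFAULT DIRECTORY export_dumps
    LOCATION ('gnis_unload.dmp')
  )
  AS SELECT * FROM gnis;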
transform=segment_attributes:n
This may look like a small thing, but I have often wished the old Import utility had something like the transform parameter. I frequently ran into trouble importing a small amount of production data into a test system, because even though much of the space allocated to the production segments was empty, the allocation was still far larger than my test system could hold. When importing a single table, pre-creating the table is one way around the problem, but as the number of tables grows, pre-creating them all becomes impractical. A simple switch such as transform makes it easy to ignore all of the segment attributes in the dump file.
With all of the options described above in a parameter file, I can invoke the Import utility as follows:
impdp michigan/password parfile=gnis_import.par
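Collected here for reference, the gnis_import.par file contains the parameters built up over the preceding sections:

dumpfile=gnis%U.dmp
directory=export_dumps
logfile=gnis_import.log
job_name=gnis_import
include=TABLE:"= 'GNIS'"
remap_schema=gennick:michigan
query="WHERE gnis_state_abbr = 'MI'"
transform=segment_attributes:n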
When importing without broad privileges, you need to connect as the target schema. If you have the imp_full_database role, you can log in as yourself and import into any target schema.
Performance and versatility
Oracle Data Pump delivers a significant performance improvement over the earlier Export and Import utilities. Much of that improvement comes from parallelizing the reading and writing of dump files. You choose the degree of parallelism, trading off speed against resource consumption as your situation requires.
Data Pump also makes good use of other recent innovations in the Oracle database. Flashback is used to ensure the consistency of exported data, and the flashback_scn and flashback_time parameters give you full control over that capability. The direct path API is used wherever possible to boost performance; when direct path cannot be used, data moves through external tables built on the new oracle_datapump external table access driver.
Beyond raw performance, Data Pump gives you flexibility, as you have seen in the include, exclude, query, and transform parameters, among others. These parameters give you precise control over which data and objects are loaded and unloaded.
In a world where the meaning of "large database" is continually being redefined, and where databases have reached sizes that were only a dream ten years ago, Data Pump is a welcome addition to the database administrator's toolbox, letting you move data into and out of your databases at unprecedented speed.
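As an example of the flashback control mentioned above (the timestamp is illustrative only), an export parameter file could pin the export to a point in time like this:

flashback_time="TO_TIMESTAMP('2004-06-01 09:00:00', 'YYYY-MM-DD HH24:MI:SS')"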
Jonathan Gennick (Jonathan@Gennick.com) is an experienced Oracle database administrator and Oracle certified professional who lives in Michigan's Upper Peninsula. He manages an Oracle article e-mail list; visit gennick.com to learn more. His most recent book is Oracle Regular Expressions Pocket Reference (O'Reilly & Associates, 2003).