sqlldr is a recommended tool for loading large data volumes. It provides many performance options to minimize redo and undo generation and to control the loading mode (insert, append, replace, truncate).
The project needed a comparison against Data Pump, which turned out not to be ideal for our case, so we still wanted to use sqlldr, and I ran a simple test.
According to Thomas Kyte, the fastest way to load data is a parallel direct-path load, which writes only formatted data blocks and minimizes redo and undo generation.
First, write the following script. The data can be generated dynamically from any of the user's tables.
sqlplus -s $1 <<EOF
set pages 0
col object_name format a30
set linesize 10000
set feedback off
set colsep ','
spool $2.lst
select * from $2;
spool off;
EOF
The data generated after running is roughly as follows.
[ora11g@rac1 sqlldr]$ ksh spooldata.sh n1/n1 t
370753,10205,KU$_DOMIDX_OBJNUM_VIEW,VIEW
370754,10207,KU$_OPTION_OBJNUM_T,TYPE
370755,10208,KU$_EXPREG,VIEW
370756,10210,SYS_YOID0000010209$,TYPE
370757,10209,KU$_OPTION_OBJNUM_VIEW,VIEW
370758,10211,KU$_OPTION_VIEW_OBJNUM_VIEW,VIEW
370759,10212,KU$_MARKER_T,TYPE
370760,10214,SYS_YOID0000010213$,TYPE
370761,10213,KU$_MARKER_VIEW,VIEW
370762,10215,KU$_TABPROP_VIEW,VIEW
370763,10216,KU$_PFHTABPROP_VIEW,VIEW
370764,10217,KU$_REFPARTTABPROP_VIEW,VIEW
370765,10218,KU$_MVPROP_VIEW,VIEW
370766,10219,KU$_MVLPROP_VIEW,VIEW
370767,10220,KU$_TTS_VIEW,VIEW
370768,10221,KU$_TAB_TS_VIEW,VIEW
370769,10222,KU$_TTS_IND_VIEW,VIEW
370770,10223,KU$_IND_TS_VIEW,VIEW
370771,10224,KU$_CLU_TS_VIEW,VIEW
Then prepare the control file sqlldr.ctl to load the data from t.lst into table tt.
load data
into table tt
fields terminated by ','
(id, object_id, object_name, object_type)
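As an aside, if trimspool were not an option (for example, if the spool files already existed), the control file itself could absorb the padding. This variant is a sketch, not part of the original test: CHAR(10000) enlarges the field buffer beyond the 255-byte default, and the RTRIM expression strips the trailing blanks before insert.

```
load data
into table tt
fields terminated by ','
(id,
 object_id,
 object_name,
 object_type char(10000) "rtrim(:object_type)")
```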
Try the import:
[ora11g@rac1 sqlldr]$ sqlldr n1/n1 control=sqlldr.ctl data=t.lst
SQL*Loader: Release 11.2.0.3.0 - Production on Tue May 27 08:09:25 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
However, no "commit point reached" feedback appears. The automatically generated sqlldr.log contains the following errors:
Column Name                  Position   Len  Term Encl Datatype
---------------------------- ---------- ---- ---- ---- ---------
ID                           FIRST      *    ,         CHARACTER
OBJECT_ID                    NEXT       *    ,         CHARACTER
OBJECT_NAME                  NEXT       *    ,         CHARACTER
OBJECT_TYPE                  NEXT       *    ,         CHARACTER
Record 1: Rejected - Error on table TT, column OBJECT_TYPE.
Field in data file exceeds maximum length
Record 2: Rejected - Error on table TT, column OBJECT_TYPE.
Field in data file exceeds maximum length
Record 3: Rejected - Error on table TT, column OBJECT_TYPE.
Field in data file exceeds maximum length
Record 4: Rejected - Error on table TT, column OBJECT_TYPE.
Field in data file exceeds maximum length
After some experimentation, we finally found the cause: linesize was set to a large value, so each spooled line is padded with blanks out to that length. When the line is parsed on commas, all of that trailing padding ends up inside the last field, which at load time is far longer than the table column allows.
Specifying each column length by hand in the control file was impractical here.
At that point I remembered the trimspool option and gave it a try.
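The effect can be reproduced outside SQL*Plus. This is a quick sketch (the padded sample line below is fabricated to mimic the spool output): sed strips the trailing blanks, which is the same cleanup that "set trimspool on" performs inside SQL*Plus.

```shell
# A spooled line padded with blanks out to the linesize: after splitting on
# commas, the padding becomes part of the last field.
padded='370753,10205,KU$_DOMIDX_OBJNUM_VIEW,VIEW                              '

# Strip trailing blanks, as trimspool would
printf '%s\n' "$padded" | sed 's/[[:blank:]]*$//'
```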
The updated spooldata.sh script is as follows:
sqlplus -s $1 <<EOF
set pages 0
col object_name format a30
set linesize 10000
set trimspool on
set feedback off
set colsep ','
spool $2.lst
select * from $2 where rownum <20;
spool off;
EOF
Importing the file again now works without a problem.
[ora11g@rac1 sqlldr]$ sqlldr n1/n1 control=sqlldr.ctl data=t.lst
SQL*Loader: Release 11.2.0.3.0 - Production on Tue May 27 08:14:44 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Commit point reached - logical record count 19
Now let's see how much of an improvement the direct-path method gives.
The test used roughly 800,000 rows.
Without direct path, the data is committed in batches (64 rows per commit point, as the log shows); the load took 79 seconds, about 10,000 rows per second.
Commit point reached - logical record count 793480
Commit point reached - logical record count 793544
Commit point reached - logical record count 793608
Commit point reached - logical record count 793672
Commit point reached - logical record count 793736
Commit point reached - logical record count 793800
Commit point reached - logical record count 793864
Commit point reached - logical record count 793928
Commit point reached - logical record count 793992
Commit point reached - logical record count 794056
Commit point reached - logical record count 794120
Commit point reached - logical record count 794184
Commit point reached - logical record count 794248
Commit point reached - logical record count 794312
Commit point reached - logical record count 794369
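The gap between consecutive commit points in the log confirms the batch size per commit:

```shell
# Two consecutive commit points taken from the conventional-path log above
prev=793480
next=793544
echo $((next - prev))   # rows per commit batch, prints 64
```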
However, with direct=true the speed improves dramatically and the output is just a single line: 794,369 rows loaded in 8 seconds, roughly 100,000 rows per second.
[ora11g@rac1 sqlldr]$ sqlldr n1/n1 direct=true control=sqlldr.ctl data=t.lst
SQL*Loader: Release 11.2.0.3.0 - Production on Tue May 27 07:56:31 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Load completed - logical record count 794369.
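The throughput figures quoted above follow directly from the row count and elapsed times:

```shell
total=794369        # rows loaded (from the sqlldr output)
conv_secs=79        # conventional-path elapsed time
direct_secs=8       # direct-path elapsed time
echo "conventional: $((total / conv_secs)) rows/sec"   # about 10,000
echo "direct:       $((total / direct_secs)) rows/sec" # about 100,000
```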