From: http://blog.csdn.net/nsj820/article/details/6744842
SQL * loader is an oracle high-speed Batch Data Loading tool. This is a very useful tool that can be used to load data from multiple flat file formats to Oracle databases. Sqlldr can load a large amount of data in a very short time. It hasTwo operation modes:
Conventional path: sqlldr uses SQL insert to load data for us.
Direct path: In this mode, sqlldr does not use SQL, but directly formats database blocks, bypassing the entire SQL engine and undo generation, and possibly avoiding redo generation.To fully load data in a database without any data, the fastest way is to load data using parallel direct paths.
Regular path loading uses the SQL insert statement and the bond array buffers in the memory to load data to the Oracle database table. This process competes with other processes for memory resources within the SGA. If the database already supports the overhead of multiple concurrent processing processes, regular path loading will reduce the load performance.
Another overhead of loading using a regular path is that the loading process must search for the database to find the partially filled blocks of the mounted table and try to fill these blocks. This is very effective for daily transaction processing, but it is an additional overhead for conventional path loading.
It is best or sometimes you must use the conventional path loading method, rather than the direct path loading method:
1. If the mounted table is indexed and concurrently accessed, or if you want to insert or delete the table, you must use the regular path for loading.
2. When using SQL functions in the control file, you must use the regular path for loading. SQL functions are not applicable when direct path loading is used.
3. When the mounted table is a cluster table.
4. When loading a small number of records to a large index table, or when the table has reference integrity or check constraints.
5. When loading is performed on different platforms through SQL * Net or net8, in order to use direct path loading, the two nodes must belong to the same computer family and use the same character set.
You do not need to use SQL insert statements or the key array cache to directly load and format input data to Oracle data blocks and write them directly into the database. Note that direct path loading always inserts data above the maximum water level of the table, which eliminates the time used to search for partial filling blocks.
Sqlldr is a command tool. It is not an API and cannot be called from PL/SQL.
SQL * loader has many functions, including the following capabilities:
You can load data from multiple input data files of different file types;
The input record can be a fixed-length or variable-length record;
Multiple tables can be loaded in the same operation, and the selected records can be logically loaded into each table;
Before the input data is loaded into the table, you can use the SQL function for it;
Multiple physical records can be compiled into one logical record. Similarly, SQL can extract one physical record and load it as multiple logical records;
Nested, nested tables, varrays, and lobs (including blogclobnlobbfile) are supported ).
SQL * loader component:
0. Control File
The control file contains information describing the input data (such as the layout and type of the input data), as well as information about the target table, control files can even contain the data to be loaded.
1. SQL * loader input data:
SQL * loader can receive data files in different formats. Files can be stored on disks or tapes, or records can be nested into control files. The record format can be fixed-length or variable-length. A fixed-length record is a record with the same fixed length, the data fields in each record also have the same fixed length, data type, and location.
2. SQL * loader output:
(1) load data
(2) infile *
(3) into Table Dept
(4) fields terminated ','
(5) (deptno, dname, Loc)
(6) begindata
(7) 10, sales, Virginia
(8) 20, accounting, Virginia
(9) 30, consulting, Virginia
(10) 40, finance, Virginia
Load data (1): This will tell sqlldr what to do (in this example, it indicates the data to be loaded ). Sqlldr can also execute continue_load, that is, Continue loading. This option can be used only when a multi-table direct path is loaded.
Infile * (2): This tells sqlldr that the data to be loaded is actually included in the control file, as shown in Row 6-10. You can also specify the file name of another file that contains data. If you want to use a command line parameter, You can overwrite this infile statement. [The command line option overwrites the Control File Settings].
Into Table dept (3): This tells sqlldr which table to load data.
Fileds terminated by ',' (4): indicates that sqlldr data should be separated by commas.
(Deptno, dname, Loc) (5): tells sqlldr the sequence and Data Type of the columns to be loaded in the input data. This refers to the Data Type of the input stream, rather than the data type in the database. In this example, the column data type is Char (255) by default ).
Begindata (6): tells sqlldr that you have completed the description of the input data. The following rows (rows 7-10) are the specific data to be loaded to the dept table.
To use the preceding control file, create an empty dept table:
Create Table Dept
(Deptno number (2) Constraint dept_pk primary key,
Dname varchar2 (14 ),
Loc varchar2 (13)
);
And run the following command:
Sqlldr userid =/control = demo1.ctl
Table loading method:
InsertThis is the default method. This method assumes that the table is empty before data loading. If there is a record in the table, sqlldr exits and reports: SQLLDR-601: For insert option, table must be empty, error on table Dept
AppendThis method allows records to be added to database tables without affecting existing records.
ReplaceThis method first deletes existing records in the table and then starts loading new records. Note: When the old record is deleted, any deletion trigger on the table will be triggered.
TruncateBefore loading data, use the SQL command truncate to delete old records. This method is much faster than replace because the trigger is removed and no rollback is created. To use this method, constraints must be disabled and specific permissions must be granted.
How to load the bounded data?
Delimited data is the data separated by a special character.
Example:
Fields terminated by ', 'optionally enclosed '"'
In the preceding example, data fields are separated by commas (,). Each field can be enclosed by double quotation marks.
Terminated by '9' (use a tab in hexadecimal format; When ASCII is used, the tab is 9)
Terminated by whitespace
How to load data in a fixed format?
Usually there is a flat file generated by an external system, and this is a fixed-length file, which contains positional data ). to load fixed-position data with a fixed width, the position keyword is used in the control file:
Load data
Infile *
Into Table Dept
Replace
(Deptno position ),
Dname position (3: 16 ),
Loc position (17: 29)
)
Begindata
10 accountin Virginia, USA
You can use the relative offset.
Deptno position ),
Dname position (*: 16)
Dname is a string of 3 to 16 characters.
How to load a date?
You only need to control the date data type in the file and specify the date mask to be used. The date mask is the same as the date mask used in to_char and to_date in the database.
How to use functions to load data?
You only need to add the function to the control file.
(Dname "upper (: dname )"
)
Trailing nullcols will cause the bound variable to become null. If data in a column does not exist in the input record, sqlldr will bind a null value to the column.
The following are some tips for adding SQL * loader performance:
1) Use a location domain instead of a separate domain. The separator domain requires the loader to search for data to find the delimiter. The location domain is faster, because the loader only needs to perform simple pointer operations.
2) specify the maximum length for the ending domain so that each bundled array is inserted more effectively.
3) pre-allocate enough storage space. When data is loaded, the table requires more space. Oracle allocates more intervals to accommodate data. If this operation is performed frequently during data loading, processing overhead will be very large. Computing or estimation of Bucket requirements before loading allows you to create necessary buckets in advance.
4) if possible, avoid using nullif and defaultif clauses in the control file. These two clauses cause Column Operations for each record to be loaded.
5) split data files and run regular path loading in parallel.
6) Reduce the number of submissions by using the CLI parameter rows.
7) Avoid unnecessary Character Set conversion and ensure that the client's nls_lang environment is the same as that on the server.
8) use the direct path Loading Method whenever possible.
9) when the direct path loading method is used, the maximum index of the table is pre-ordered and the sorted indexes clause is used.
10) when using the direct path loading method, try to use the parallel direct path option.
11) Use as few redo logs as possible during direct path loading. There are three different levels of control to achieve this:
Prohibit database archiving;
Use the keyword unrecoverable in the control file;
Use the nolog attribute to modify tables and/or indexes.
Examples of parallel loading of direct paths:
/Opt/APP/Oracle/product/10.2.0/bin/sqlldr APS/APS control =/home/Oracle/aps_load/CTL/ap_contract.ctlDirect = true
Parallel = trueLog =/home/Oracle/aps_load/log/$ yesterday/ap_contract _ $ yesterday. log bad =/home/Oracle/aps_load/bad/due_bill _ $ yesterday. bad rows = 10000 readsize = 20000000 bindsize = 20000000 discard =/home/Oracle/aps_load/bad/discard_ts.dis