How to use Oracle External tables to import data from text files, oracle Text Files
Recently, colleagues are busy with Data Consistency comparison. They need to compare the data in different text files. Some files are large and have many records. If they are opened in a common text editor, it is obvious that, it will be very difficult, or even cannot be opened.
Based on this, you can import the data of the text file to the database for comparison at the collection level.
So how to import text file data into the database? Here, we mainly use the External table feature of Oracle.
Oracle External tables support two types of drivers: ORACLE_LOADER, which means that the data in external tables must be from file files, and ORACLE_DATAPUMP, the data in the External table must be a binary dump file. The dump file is a file previously filled with Oracle internal table data imported into the external table. Obviously, Oracle wants to keep the data in the database for processing.
First, let's take a look at the size and record of the text file.
[oracle@node2 ~]$ du -sm P_20150626010000_2002371.0003479598 274 P_20150626010000_2002371.0003479598[oracle@node2 ~]$ wc -l P_20150626010000_2002371.0003479598 2899265 P_20150626010000_2002371.0003479598
From the above output, we can see that the file is 274 MB and has 2899265 records.
Second, create an external TABLE statement.
CREATE TABLE emp_load (subsid number(18), servnumber VARCHAR2(20 CHAR), subsprodid NUMBER(18), prodid VARCHAR2(32 CHAR), startdate date, enddate date, owner VARCHAR2(4 CHAR))ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY tmp ACCESS PARAMETERS (RECORDS DELIMITED BY NEWLINE FIELDS TERMINATED BY "|" ( subsid DECIMAL EXTERNAL, servnumber CHAR(20), subsprodid DECIMAL EXTERNAL, prodid CHAR(32), startdate date "yyyymmddhh24miss", enddate date "yyyymmddhh24miss", owner CHAR(4) ) ) LOCATION ('P_20150626010000_2002371.0003479598') );
Note: The Directory tmp must exist because it is executed under the scott user, so the scott user must have read and write permissions on the path.
3. Execute the table creation statement under the scott user.
4. Check whether the generated external table is faulty.
SQL> select count(*) from emp_load; COUNT(*)------------------ 2899265
The number of records is consistent with the number of records viewed by wc-l.
Note that no error is reported during table creation, which does not necessarily mean that the data has been successfully loaded into the external table. You must query the External table to determine whether the data has been successfully loaded. If an error occurs, see the log files generated in the current directory. In this example, EMP_LOAD_2000.bad and EMP_LOAD_2000.log are used.
Of course, data in external tables can only be queried and cannot be used for DML operations. For example, you can randomly delete a piece of data in a table.
SQL> delete from emp_load where rownum=1;delete from emp_load where rownum=1 *ERROR at line 1:ORA-30657: operation not supported on external organized table
If you want to perform DML operations on the External table data, you can first import the data of the External table to the internal table. The procedure is as follows:
SQL> create table test as select * from emp_load where 1=0;Table created.Elapsed: 00:00:00.26SQL> INSERT /*+ APPEND */ INTO test select * from emp_load;2899265 rows created.Elapsed: 00:01:00.29SQL> select * from test where rownum<=100;select * from test where rownum<=100 *ERROR at line 1:ORA-12838: cannot read/modify an object after modifying it in parallelElapsed: 00:00:00.10SQL> commit;Commit complete.Elapsed: 00:00:00.07SQL> select * from test where rownum<=100;
Here, in order to save time, I used direct path insertion. It can be seen that inserting nearly 3 million data only took about 1 minute. Considering the database on my virtual machine, only MB of memory is allocated to it, and the loading efficiency is considerable.
SQL> show parameter memoryNAME TYPE VALUE------------------------------------ ----------- ------------------------------hi_shared_memory_address integer 0memory_max_target big integer 300Mmemory_target big integer 300Mshared_memory_address integer 0
Note: In SQL * PLUS, the output of the number field is 10 by default. As a result, a value such as 937116510102250300 may be displayed as 9.3712E + 17. Here, you can use set numwidth 18 to display the value of the complete number field.