How to use Oracle External tables to import data from text files, oracle Text Files

Source: Internet
Author: User

How to use Oracle External tables to import data from text files, oracle Text Files

Recently, colleagues are busy with Data Consistency comparison. They need to compare the data in different text files. Some files are large and have many records. If they are opened in a common text editor, it is obvious that, it will be very difficult, or even cannot be opened.

Based on this, you can import the data of the text file to the database for comparison at the collection level.

So how to import text file data into the database? Here, we mainly use the External table feature of Oracle.

Oracle External tables support two types of drivers: ORACLE_LOADER, which means that the data in external tables must be from file files, and ORACLE_DATAPUMP, the data in the External table must be a binary dump file. The dump file is a file previously filled with Oracle internal table data imported into the external table. Obviously, Oracle wants to keep the data in the database for processing.

First, let's take a look at the size and record of the text file.

[oracle@node2 ~]$ du -sm P_20150626010000_2002371.0003479598 274    P_20150626010000_2002371.0003479598[oracle@node2 ~]$ wc -l P_20150626010000_2002371.0003479598 2899265 P_20150626010000_2002371.0003479598

From the above output, we can see that the file is 274 MB and has 2899265 records.

Second, create an external TABLE statement.

CREATE TABLE emp_load   (subsid number(18),    servnumber VARCHAR2(20 CHAR),    subsprodid NUMBER(18),    prodid VARCHAR2(32 CHAR),    startdate date,    enddate  date,     owner VARCHAR2(4 CHAR))ORGANIZATION EXTERNAL   (TYPE ORACLE_LOADER    DEFAULT DIRECTORY tmp    ACCESS PARAMETERS      (RECORDS DELIMITED BY NEWLINE        FIELDS TERMINATED BY "|"             ( subsid      DECIMAL EXTERNAL,               servnumber  CHAR(20),               subsprodid  DECIMAL EXTERNAL,               prodid  CHAR(32),                startdate date "yyyymmddhh24miss",               enddate date "yyyymmddhh24miss",               owner   CHAR(4)               )      )    LOCATION ('P_20150626010000_2002371.0003479598')   ); 

Note: The Directory tmp must exist because it is executed under the scott user, so the scott user must have read and write permissions on the path.

3. Execute the table creation statement under the scott user.

4. Check whether the generated external table is faulty.

SQL> select count(*) from emp_load;      COUNT(*)------------------       2899265

The number of records is consistent with the number of records viewed by wc-l.

Note that no error is reported during table creation, which does not necessarily mean that the data has been successfully loaded into the external table. You must query the External table to determine whether the data has been successfully loaded. If an error occurs, see the log files generated in the current directory. In this example, EMP_LOAD_2000.bad and EMP_LOAD_2000.log are used.

Of course, data in external tables can only be queried and cannot be used for DML operations. For example, you can randomly delete a piece of data in a table.

SQL> delete from emp_load where rownum=1;delete from emp_load where rownum=1            *ERROR at line 1:ORA-30657: operation not supported on external organized table

If you want to perform DML operations on the External table data, you can first import the data of the External table to the internal table. The procedure is as follows:

SQL> create table test as select * from emp_load where 1=0;Table created.Elapsed: 00:00:00.26SQL> INSERT /*+ APPEND */ INTO test select * from emp_load;2899265 rows created.Elapsed: 00:01:00.29SQL> select * from test where rownum<=100;select * from test where rownum<=100              *ERROR at line 1:ORA-12838: cannot read/modify an object after modifying it in parallelElapsed: 00:00:00.10SQL> commit;Commit complete.Elapsed: 00:00:00.07SQL> select * from test where rownum<=100;

Here, in order to save time, I used direct path insertion. It can be seen that inserting nearly 3 million data only took about 1 minute. Considering the database on my virtual machine, only MB of memory is allocated to it, and the loading efficiency is considerable.

SQL> show parameter memoryNAME                     TYPE     VALUE------------------------------------ ----------- ------------------------------hi_shared_memory_address         integer     0memory_max_target             big integer 300Mmemory_target                 big integer 300Mshared_memory_address             integer     0

Note: In SQL * PLUS, the output of the number field is 10 by default. As a result, a value such as 937116510102250300 may be displayed as 9.3712E + 17. Here, you can use set numwidth 18 to display the value of the complete number field.

 

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.