Sort large files using Oracle External tables

Source: Internet
Author: User
Tags create directory

Problem: Sort unordered text files with 0.2 billion rows in a column to generate a new sorted text file.

1. Generate unordered file, bigfiletest. JavaCodeAs follows:

 
Import Java. io. bufferedwriter; import Java. io. filewriter; import Java. util. random; public class bigfiletest {static random = new random (); public static void main (string [] ARGs) throws exception {createfile ();} public static void createfile () throws exception {bufferedwriter fw = new bufferedwriter (New filewriter ("D: \ bigfiletest \ bigfile.txt"); For (INT I = 1; I <200000000; I ++) {FW. write (random. nextlong () + ""); FW. newline (); if (I %10000 = 0) {FW. flush ();}}}}

 

Javac bigfiletest. Java
Java bigfiletest

A 0.2 billion-row Upload File bigfile.txt is generated.

 

2. Create an External table

Create directory data_dir as 'd: \ bigfiletest \ '; Create Table bt_ext_test (A varchar2 (30 )) organization external (type oracle_loader default directory data_dir access parameters (records delimited by newline characterset zhs16gbk badfile data_dir: 'bigfile. bad 'discardfile data_dir: 'bigfile. DSC 'logfile' bigfile. log 'fields terminated by 0x '09' ldrtrim missing field values are null reject rows with all null fields) Location ('bigfile.txt ') parallelreject limit unlimited;

3. Use the sqlplus spool to generate new sorted files

 
Set echo offset feedback offset termout offset arrarsize 5000 set heading offset head offset trimout onset pagesize 0 set trimspool onset; inesize 30 spool result.txt select/* + parallel (bt_ext_test, 8) */* From bt_ext_test order by a; spool offexit;

On four dual-core CPUs and 64-bit oracle11.2, 8 parallel queries are used to generate sorting files for 32 minutes.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.