Use SQLULDR2 to quickly import Oracle data into Greenplum database without landing

Source: Internet
Author: User

Environmental information:

greenplum:greenplum Version: ' Postgres (greenplum Database) 4.3.8.2 build 1 '

Oracle:Release 11.2.0.1.0 Production

1. Oracle Client Deployment

Before using the SQLULDR2 tool, you first need to deploy the Oracle client on the Master node of Greenplum ( hereinafter referred to as GP) because SQLULDR2 need to use shared libraries such as libclntsh.so .

If you want to save the hassle, also directly from the Oracle server to copy the installation directory, and then configure the following environment variables can be.

To prevent the configuration of Oracle environment variables from affecting the use of the GP database Client Tools, we create a single user for the Oracle client and then configure the Oracle environment variables separately. Not to participating with GP .

First create the Oracle user groups and users:

Groupadd-g-DBA

Groupadd-g 10001 Oinstall

Useradd-u 1002-d/var/lib/oracle-s/bin/bash-g oinstall-g dba-m Oracle

passwd Oracle

For example, we will directly copy the Oracle server's software Installation Catalog product and put it under /var/lib/oracle :

[Email protected] oracle]# pwd

/var/lib/oracle

[Email protected] oracle]# chown-r oracle:oinstall/var/lib/oracle

[email protected] oracle]# LL

DRWXRWXRWX 3 Oracle Oinstall 4096 Jul 22:14 product

-RWXRWXRWX 2 Oracle Oinstall 185766 Mar sqluldr2_linux64_10204.bin

Finally, we configure the relevant environment variables for Oracle to be added to the . BASHRC file in the Oracle 's home directory:

Export Oracle_base=/var/lib/oracle

Export Oracle_home=/var/lib/oracle/product/11.2.0/dbhome_1

Export Ld_library_path= $ORACLE _home/lib:/usr/lib

Export path= $ORACLE _home/bin:/sbin:/usr/sbin:/usr/ccs/bin:/usr/bin:/sbin: $PATH:/bin

Export tns_names= $ORACLE _home/network/admin

Export tns_admin= $ORACLE _home/network/admin

# CLASSPATH must include the following JRE locations:

Classpath=${oracle_home}/rdbms/jlib:${classpath}

Export classpath= $CLASSPATH:/opt/oracle/product/11/network/jlib

# also configure the GP 's environment information, and then use gpload to access the GP database.

source/usr/local/greenplum-db/greenplum_path.sh

Export Master_data_directory=/hadoopdata1/master/gpseg-1

Export pgport=5432

Export Pguser=gpadmin

Export Pgdatabase=template1

2. SQLULDR2 Installation Deployment

Download the sqluldr2_linux64_10204.bin file, if your environment is Windows or linux32 , download the corresponding file.

We can rename the file and put it in the following path or create a soft link ( which we have configured in path ):

[Email protected] ~]$ pwd

/var/lib/oracle

[email protected] ~]$ ll product/11.2.0/dbhome_1/bin/sqluldr2

-RWXRWXRWX 2 Oracle Oinstall 185766 Mar PRODUCT/11.2.0/DBHOME_1/BIN/SQLULDR2

Let's check it out:

[Email protected] ~]$ SQLULDR2

Sql*unloader:fast Oracle Text unloader (GZIP, Parallel), Release 4.0.1

(@) Copyright Lou fangxin (anysql.net) 2004-2010, All rights reserved.

License:free for non-commercial useage, else USD per server.

USAGE:SQLULDR2 Keyword=value [, Keyword=value,...]

Valid Keywords:

user = Username/[email protected]

sql = SQL file name

query = SELECT statement

field = Separator string between fields

Record = separator string between records

rows = Print progress for every given rows (default, 1000000)

File = Output file name (default:uldrdata.txt)

Log = log file name, prefix with + to append mode

Fast = Auto Tuning the session level parameters (YES)

Text = output Type (MYSQL, CSV, Mysqlins, Oracleins, FORM, SEARCH).

CharSet = Character Set name of the target database.

Ncharset= national Character set name of the target database.

parfile = Read command option from parameter file

For field and record, you can use the ' 0x ' to specify hex character code,

\r=0x0d \n=0x0a |=0x7c, =0x2c, \t=0x09,: =0x3a, #=0x23, "=0x22" =0x27

You can see a very detailed help note.

3. gpload Brief Introduction

Gpload is a tool provided in GP for data loading, which can be seen as a encapsulation of gpfdist , so it relies on gpfdist components.

gpload Specifies the details of the load through its control file. So writing control files is the key to good gpload . gpload 's control file is written in the YAML1.1 document format, so it must be in a valid YAML format.

This piece of content I have introduced in the blog http://blog.csdn.net/jiangshouzhuang/article/details/51817520 , we can look at. This chapter focuses on migrating Oracle data directly into the GP database.

4. The principle of direct parallel loading of Oracle data into GP

Using gpload, we can realize the merge operation that can't be implemented directly in GP , and we can implement parallel fast loading without landing files by combining named pipes. This helps us to increase the load efficiency of massive data, and avoids the excessive storage overhead that is loaded with traditional landing files, as well as the loading performance bottleneck caused by the massive file landing process.

The method used in this article is through the Linux Named pipes,Gpload, and sqluldr2 programs ( fast offload for Oracle data ) enables fast loading from Oracle to GP without landing files.

5. Example

Scenario: Load Oracle 's iphone_user_detail table into the iphone_user_detail table in the GP database.

First, we create the gpload control file, thegpload.ctl content is as follows:

---

version:1.0.0.1

database:zhangyun_db

User:zhangyun

Host:cdha

port:5432

Gpload:

INPUT:

-SOURCE:

Local_hostname:

-Cdha

port:46666

FILE:

-/var/lib/oracle/iphone_user_detail/iphone_user_detail_file

-Format:text

-Delimiter:e ' \001 '

-Null_as: "

-Encoding:utf8

-Error_limit:25

-Error_table:err_iphone_user_detail

OUTPUT:

-TABLE:ann_gbs_safe.iphone_user_detail

-Mode:insert

Then write the script export_iphone_user_detail.sh, which is used to perform data migration work:

#!/bin/bash

Mknod/var/lib/oracle/iphone_user_detail/iphone_user_detail_file P

SQLULDR2 user=zy/[email protected]:1521/salesdb query= "Select/*+ Parallel (8) */Deptno,fmi,location,iphone_number From Pps.iphone_user_detail "field=0x01 Charset=utf8 safe=yes File=/var/lib/oracle/iphone_user_detail/iphone_user_ Detail_file &

Gpload-f gpload.ctl-v-L Gpload.log

Rm-rf/var/lib/oracle/iphone_user_detail/iphone_user_detail_file

6. Execute the export_iphone_user_detail.sh script and view the log

Before executing the script, make sure that the iphone_user_detail table has been created in GP .

[Email protected] ~]$ bash export_iphone_user_detail.sh

2016-07-13 14:47:00|info|gpload Session started 2016-07-13 14:47:00

2016-07-13 14:47:00|info|started gpfdist-p 21211-p 21212-f "/var/lib/oracle/iphone_user_detail/iphone_user_detail_ File "-T 30

0 rows exported at 2016-07-13 14:47:03, size 0 MB.

1000000 rows exported at 2016-07-13 14:47:10, size 719 MB.

2000000 rows exported at 2016-07-13 14:47:18, size 1451 MB.

3000000 rows exported at 2016-07-13 14:47:25, size 2154 MB

.............................

When the execution is complete, you can go to the GP database to view it.

zhangyun_db=# Select COUNT (*) from Iphone_user_detail;

Count

--------

194000000

(1 row)

Use SQLULDR2 to quickly import Oracle data into Greenplum database without landing

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.