Processing CLOB and BLOB fields with Sqoop


[Author]: Kwu

Sqoop can import CLOB and BLOB fields. In Oracle, a CLOB stores large text and a BLOB stores binary data. Both column types need special handling when they are imported into Hive or HDFS.

1. Test table in Oracle

CREATE TABLE t_lob (
    A INTEGER,
    B CLOB,
    C BLOB
);

Test data:

INSERT INTO t_lob (A, B, C) VALUES (1, 'Clob test', TO_BLOB('3456'));


2. Sqoop script

import
--append
--connect
jdbc:oracle:thin:@localhost:1521/orcl
--username
wuke
--password
abcd1234
--table
BDC_TEST.T_LOB
--columns
"A,B,C"
--target-dir
/tmp/t_lob
-m
1

Run the script:

sqoop --options-file ./importhdfs.opt



3. View the generated HDFS file


In the generated file, the CLOB field appears as ordinary text, while the BLOB field, which holds binary data, is written to HDFS as a hexadecimal string.
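To make the hexadecimal representation concrete, here is a minimal sketch of how a single byte becomes two hex characters. The class name `HexNibbleDemo` and the sample value are hypothetical and chosen only for illustration; this is not Sqoop's own code.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only illustrates the byte-to-hex mapping.
final class HexNibbleDemo {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    // Renders one byte as two hex characters: high nibble first, then low nibble.
    static String toHex(byte b) {
        char high = DIGITS[(b & 0xF0) >>> 4]; // top four bits shifted down to 0..15
        char low = DIGITS[b & 0x0F];          // bottom four bits are already 0..15
        return "" + high + low;
    }

    // Renders a whole byte array as one continuous hex string.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(toHex(b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] sample = "blob".getBytes(StandardCharsets.US_ASCII);
        System.out.println(toHex(sample)); // prints 626c6f62
    }
}
```

Each byte always yields exactly two characters, so the hex text is twice as long as the original binary content.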

The hexadecimal string can be converted back to a regular string with the following class, which does the conversion with bit-shift operations:

package com.ganymede.test;

import java.nio.charset.StandardCharsets;

/**
 * Hex conversion utilities.
 *
 * @author Ganymede
 */
public class Hex {

    /** Lowercase digits used to build hexadecimal output. */
    private static final char[] DIGITS_LOWER = {'0', '1', '2', '3', '4', '5', '6', '7',
            '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'};

    /** Uppercase digits used to build hexadecimal output. */
    private static final char[] DIGITS_UPPER = {'0', '1', '2', '3', '4', '5', '6', '7',
            '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'};

    /** Converts a byte array to a lowercase hexadecimal character array. */
    public static char[] encodeHex(byte[] data) {
        return encodeHex(data, true);
    }

    /**
     * Converts a byte array to a hexadecimal character array.
     *
     * @param toLowerCase true for lowercase output, false for uppercase output
     */
    public static char[] encodeHex(byte[] data, boolean toLowerCase) {
        return encodeHex(data, toLowerCase ? DIGITS_LOWER : DIGITS_UPPER);
    }

    /**
     * Converts a byte array to a hexadecimal character array.
     *
     * @param toDigits the digit table that controls the output
     */
    protected static char[] encodeHex(byte[] data, char[] toDigits) {
        int l = data.length;
        char[] out = new char[l << 1];
        // Two characters form the hex value of one byte.
        for (int i = 0, j = 0; i < l; i++) {
            out[j++] = toDigits[(0xF0 & data[i]) >>> 4];
            out[j++] = toDigits[0x0F & data[i]];
        }
        return out;
    }

    /** Converts a byte array to a lowercase hexadecimal string. */
    public static String encodeHexStr(byte[] data) {
        return encodeHexStr(data, true);
    }

    /**
     * Converts a byte array to a hexadecimal string.
     *
     * @param toLowerCase true for lowercase output, false for uppercase output
     */
    public static String encodeHexStr(byte[] data, boolean toLowerCase) {
        return encodeHexStr(data, toLowerCase ? DIGITS_LOWER : DIGITS_UPPER);
    }

    /**
     * Converts a byte array to a hexadecimal string.
     *
     * @param toDigits the digit table that controls the output
     */
    protected static String encodeHexStr(byte[] data, char[] toDigits) {
        return new String(encodeHex(data, toDigits));
    }

    /**
     * Converts a hexadecimal character array to a byte array.
     *
     * @throws RuntimeException if the input has an odd number of characters
     */
    public static byte[] decodeHex(char[] data) {
        int len = data.length;
        if ((len & 0x01) != 0) {
            throw new RuntimeException("Odd number of characters.");
        }
        byte[] out = new byte[len >> 1];
        // Two characters form the hex value of one byte.
        for (int i = 0, j = 0; j < len; i++) {
            int f = toDigit(data[j], j) << 4;
            j++;
            f = f | toDigit(data[j], j);
            j++;
            out[i] = (byte) (f & 0xFF);
        }
        return out;
    }

    /**
     * Converts a hexadecimal character to an integer.
     *
     * @param index position of the character in the input array
     * @throws RuntimeException if ch is not a valid hexadecimal character
     */
    protected static int toDigit(char ch, int index) {
        int digit = Character.digit(ch, 16);
        if (digit == -1) {
            throw new RuntimeException("Illegal hexadecimal character " + ch
                    + " at index " + index);
        }
        return digit;
    }

    public static void main(String[] args) {
        String srcStr = "string to be converted";
        String encodeStr = encodeHexStr(srcStr.getBytes(StandardCharsets.UTF_8));
        String decodeStr = new String(decodeHex(encodeStr.toCharArray()), StandardCharsets.UTF_8);
        System.out.println("Before conversion: " + srcStr);
        System.out.println("After conversion: " + encodeStr);
        System.out.println("After restore: " + decodeStr);
        System.out.println("---------------------------------------");
        decodeStr = new String(decodeHex("3435363738390d0a626c6f62".toCharArray()),
                StandardCharsets.UTF_8);
        System.out.println("After restore: " + decodeStr);
    }
}
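As a hedged illustration of applying this kind of decoding to an imported record, the sketch below pulls the BLOB column out of one line of the HDFS text file. It assumes Sqoop's default comma field delimiter, that the BLOB is the last field, and that the hex digits may or may not be separated by spaces. The class name `LobLineDecoder` and the sample line are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for one line of the imported text file.
final class LobLineDecoder {

    // Decodes hex text to bytes; whitespace between byte pairs is ignored.
    static byte[] decodeHex(String hex) {
        String compact = hex.replaceAll("\\s+", "");
        if ((compact.length() & 1) != 0) {
            throw new IllegalArgumentException("Odd number of hex characters");
        }
        byte[] out = new byte[compact.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int high = Character.digit(compact.charAt(2 * i), 16);
            int low = Character.digit(compact.charAt(2 * i + 1), 16);
            if (high < 0 || low < 0) {
                throw new IllegalArgumentException("Illegal hex character near index " + (2 * i));
            }
            out[i] = (byte) ((high << 4) | low);
        }
        return out;
    }

    // Assumption: the BLOB is the last comma-separated field of the line.
    static byte[] blobOf(String line) {
        int lastComma = line.lastIndexOf(',');
        return decodeHex(line.substring(lastComma + 1));
    }

    public static void main(String[] args) {
        String line = "1,Clob test,3435363738390d0a626c6f62"; // hypothetical record
        byte[] blob = blobOf(line);
        System.out.println(blob.length + " bytes");
        System.out.println(new String(blob, StandardCharsets.US_ASCII));
    }
}
```

Taking the last field avoids splitting on commas inside the CLOB text, but it would not work if the BLOB were not the final column.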

For Hive, either convert the hexadecimal value to a string before storing the data, or store it as imported and convert it afterwards with a UDF.
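The UDF approach might look roughly like the following sketch. It is written as a plain class so that it runs without Hive on the classpath; in a real deployment the class would presumably extend Hive's `org.apache.hadoop.hive.ql.exec.UDF` and be registered with `CREATE TEMPORARY FUNCTION`. The class name `HexToStringUdfSketch`, the UTF-8 charset, and the choice to return null for malformed input are all assumptions.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the conversion logic a Hive UDF could expose.
final class HexToStringUdfSketch {

    // Returns the decoded text, or null when the input is null or not valid hex.
    public String evaluate(String hex) {
        if (hex == null) {
            return null;
        }
        String compact = hex.replaceAll("\\s+", "");
        if ((compact.length() & 1) != 0) {
            return null; // an odd number of digits cannot form whole bytes
        }
        byte[] out = new byte[compact.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int high = Character.digit(compact.charAt(2 * i), 16);
            int low = Character.digit(compact.charAt(2 * i + 1), 16);
            if (high < 0 || low < 0) {
                return null; // non-hex character
            }
            out[i] = (byte) ((high << 4) | low);
        }
        // Assumption: the BLOB content is UTF-8 encoded text.
        return new String(out, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(new HexToStringUdfSketch().evaluate("626c6f62")); // prints blob
    }
}
```

Returning null instead of throwing keeps one bad row from failing a whole query, which is a design choice rather than a requirement.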
