Three ways to export Hive data (IV)


Note: Original from http://blog.csdn.net/lifuxiangcaohui/article/details/40589881

Today we'll talk about three different ways to export data from Hive.
Depending on where the data is exported to, these methods fall into three types:
(1) export to the local file system;
(2) export to HDFS;
(3) export to another table in Hive.
To keep this from being dry text, I will explain each method step by step with actual commands.

First, export to the local file system

  
hive> insert overwrite local directory '/home/wyp/wyp'
    > select * from wyp;


This HQL statement is executed as a MapReduce job. After it runs, the exported files are generated in the /home/wyp/wyp directory of the local file system; each file is the output of a reducer (here the generated file is named 000000_0). Let's look at the contents of this file:

[wyp@master ~/wyp]$ vim 000000_0
5^Awyp1^A23^A131212121212
6^Awyp2^A24^A134535353535
7^Awyp3^A25^A132453535353
8^Awyp4^A26^A154243434355
1^Awyp^A25^A13188888888888
2^Atest^A30^A13888888888888
3^Azs^A34^A899314121


As you can see, this is all the data in the wyp table. The delimiter between columns is ^A (ASCII code \x01).
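Because \x01 is a non-printing character, most editors show it oddly or not at all. A quick way to check the delimiter, assuming a cat that supports the -v flag (GNU and BSD versions both do), which renders non-printing characters visibly so \x01 shows up as ^A:

[wyp@master ~/wyp]$ cat -v 000000_0
5^Awyp1^A23^A131212121212
...(and so on for the remaining rows)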

Note that, unlike when importing data into Hive, you cannot use insert into to export data:

  
hive> insert into local directory '/home/wyp/wyp'
    > select * from wyp;
NoViableAltException(79@[])
        at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:683)
        at org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:30667)
        at org.apache.hadoop.hive.ql.parse.HiveParser.regular_body(HiveParser.java:28421)
        at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatement(HiveParser.java:28306)
        at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:28100)
        at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1213)
        at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:928)
        at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:190)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:418)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
FAILED: ParseException line 1:12 missing TABLE at 'local' near 'local' in select clause
line 1:18 cannot recognize input near 'directory' ''/home/wyp/wyp'' 'select' in select clause
 

Second, export to HDFS

This is as simple as the export to the local file system above; it can be done with the following statement:

  
hive> insert overwrite directory '/home/wyp/hdfs'
    > select * from wyp;

The data will be saved in the /home/wyp/hdfs directory of HDFS. Note that this HQL is one word shorter than the one that exports to the local file system (the local keyword is omitted), and the data ends up in a different storage path.
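To confirm the export landed where we expect, we can read it straight back from HDFS. A minimal check, assuming the reducer output file is again named 000000_0 as in the local case:

[wyp@master ~/local]$ hadoop fs -cat /home/wyp/hdfs/000000_0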

Third, export to another table in Hive

This is also one of the ways of importing data into Hive; the same statement moves data from one table to another:

 
hive> insert into table test
    > partition (age='25')
    > select id, name, tel
    > from wyp;
#####################################################################
          (a bunch of MapReduce job output, omitted here)
#####################################################################
Total MapReduce CPU Time Spent: 1 seconds 310 msec
OK
Time taken: 19.125 seconds

hive> select * from test;
OK
5       wyp1    131212121212
6       wyp2    134535353535
7       wyp3    132453535353
8       wyp4    154243434355
1       wyp     13188888888888
2       test    13888888888888
3       zs      899314121
Time taken: 0.126 seconds, Fetched: 7 row(s)

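As a side note (my own addition, not from the original article): Hive's multi-insert syntax can feed several of these destinations from a single scan of the source table. A sketch, reusing the tables and the age='25' partition from above:

from wyp
insert overwrite table test partition (age='25')
  select id, name, tel
insert overwrite local directory '/home/wyp/wyp'
  select *;

One scan of wyp populates the test partition and writes the local export in the same job, which saves a full table read when you need both.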

An attentive reader might ask: when the data is exported to a file, why is the delimiter between columns not the one set on the wyp table? The answer is that before Hive 0.11.0, you could not specify the column delimiter when exporting data; only the default delimiter could be used, i.e. the ^A shown above, which makes the exported data unintuitive and rather inconvenient.
If you are using Hive 0.11.0 or later, you can specify the delimiter between columns when you export the data.

Here is a detailed description:

Hive 0.11.0 introduced a new feature: when writing query results out to a file, the user can specify the column delimiter. Earlier versions could not do this, which caused us a lot of inconvenience. Before Hive 0.11.0 we generally wrote:

    hive> insert overwrite local directory '/home/wyp/documents/result'
    hive> select * from test;

The columns in the saved file are split with ^A (\x01):

    196^A242^A3
    186^A302^A3
    22^A377^A1
    244^A51^A2

Note that \x01 is written as ^A above only for display convenience; in an actual text editor you will not see a literal ^A but a strange control character. Now let's use the new feature of Hive 0.11.0 and specify the delimiter between the output columns:

    hive> insert overwrite local directory '/home/wyp/documents/result'
    hive> row format delimited
    hive> fields terminated by '\t'
    hive> select * from test;

Look at the output again:

    196     242     3
    186     302     3
    22      377     1
    244     51      2

The results look much better. If a column is of map type, you can use the following statement to split the map's keys and values:

    hive> insert overwrite local directory './test-04'
    hive> row format delimited
    hive> fields terminated by '\t'
    hive> collection items terminated by ','
    hive> map keys terminated by ':'
    hive> select * from src;

Based on the above, let's proceed further:

hive> insert overwrite local directory '/home/yangping.wu/local'
    > row format delimited
    > fields terminated by '\t'
    > select * from wyp;


[wyp@master ~/local]$ vim 000000_0
5       wyp1    23      131212121212
6       wyp2    24      134535353535
7       wyp3    25      132453535353
8       wyp4    26      154243434355
1       wyp     25      13188888888888
2       test    30      13888888888888
3       zs      34      899314121

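The row format clause is not limited to local exports. Below is a sketch combining it with the HDFS export from the second section, assuming your Hive version (0.11.0 or later) accepts the clause for non-local directories as well:

hive> insert overwrite directory '/home/wyp/hdfs'
    > row format delimited
    > fields terminated by '\t'
    > select * from wyp;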

In fact, we can also use Hive's -e and -f options to export data. With -e, the SQL statement follows directly in double quotation marks; with -f you pass a file whose contents are a SQL statement, as follows:

  
[wyp@master ~/local]$ hive -e "select * from wyp" >> local/wyp.txt
[wyp@master ~/local]$ cat wyp.txt
5       wyp1    23      131212121212
6       wyp2    24      134535353535
7       wyp3    25      132453535353
8       wyp4    26      154243434355
1       wyp     25      13188888888888
2       test    30      13888888888888
3       zs      34      899314121

The results obtained are also \t-separated. The same can be done with the -f option:

[wyp@master ~/local]$ cat wyp.sql
select * from wyp
[wyp@master ~/local]$ hive -f wyp.sql >> local/wyp2.txt


The result of the above statement is also \t-delimited.
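One last practical note (an addition of mine, not from the original article): when redirecting hive -e to a file, the CLI's progress and log messages can get mixed into the output on some setups. Hive's -S (silent) option suppresses them so that only query results reach the file:

[wyp@master ~/local]$ hive -S -e "select * from wyp" >> local/wyp.txt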
