Three ways to export Hive data (IV)


Note: Original from http://blog.csdn.net/lifuxiangcaohui/article/details/40589881

Today we'll talk about three different ways to export data from Hive.
Depending on where the data is exported to, these methods fall into three types:
(1) export to the local file system;
(2) export to HDFS;
(3) export to another table in Hive.
To keep this from being dry text, I will explain each method step by step with actual commands.

First, export to the local file system

  
hive> insert overwrite local directory '/home/wyp/wyp'
    > select * from wyp;


This HQL statement is executed as a MapReduce job. After it runs, the exported files are generated in the /home/wyp/wyp directory of the local file system; each file is the output of a reducer (here the generated file is named 000000_0). Let's look at the contents of this file:

[wyp@master ~/wyp]$ vim 000000_0
5^Awyp1^A23^A131212121212
6^Awyp2^A24^A134535353535
7^Awyp3^A25^A132453535353
8^Awyp4^A26^A154243434355
1^Awyp^A25^A13188888888888
2^Atest^A30^A13888888888888
3^Azs^A34^A899314121


As you can see, this is all the data in the wyp table. The delimiter between columns is ^A (ASCII code \x01).
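Because \x01 is a non-printing character, most editors show it oddly or not at all. A quick way to check the delimiter, assuming a cat that supports the -v flag (GNU and BSD versions both do), which renders non-printing characters visibly so \x01 shows up as ^A:

[wyp@master ~/wyp]$ cat -v 000000_0
5^Awyp1^A23^A131212121212
...(and so on for the remaining rows)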

Note that, unlike when importing data into Hive, you cannot use insert into to export data:

  
hive> insert into local directory '/home/wyp/wyp'
    > select * from wyp;
NoViableAltException(79@[])
        at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:683)
        at org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:30667)
        at org.apache.hadoop.hive.ql.parse.HiveParser.regular_body(HiveParser.java:28421)
        at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatement(HiveParser.java:28306)
        at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:28100)
        at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1213)
        at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:928)
        at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:190)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:418)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
FAILED: ParseException line 1:12 missing TABLE at 'local' near 'local' in select clause
line 1:18 cannot recognize input near 'directory' ''/home/wyp/wyp'' 'select' in select clause
 

Second, export to HDFS

This is as simple as the export to the local file system above; it can be done with the following statement:

  
hive> insert overwrite directory '/home/wyp/hdfs'
    > select * from wyp;

The data will be saved in the /home/wyp/hdfs directory of HDFS. Note that this HQL is one word shorter than the one that exports to the local file system (the local keyword is omitted), and the data ends up in a different storage path.
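To confirm the export landed where we expect, we can read it straight back from HDFS. A minimal check, assuming the reducer output file is again named 000000_0 as in the local case:

[wyp@master ~/local]$ hadoop fs -cat /home/wyp/hdfs/000000_0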

Third, export to another table in Hive

This is also one of the ways of importing data into Hive; the same statement moves data from one table to another:

 
hive> insert into table test
    > partition (age='25')
    > select id, name, tel
    > from wyp;
#####################################################################
          (a bunch of MapReduce job output, omitted here)
#####################################################################
Total MapReduce CPU Time Spent: 1 seconds 310 msec
OK
Time taken: 19.125 seconds

hive> select * from test;
OK
5       wyp1    131212121212
6       wyp2    134535353535
7       wyp3    132453535353
8       wyp4    154243434355
1       wyp     13188888888888
2       test    13888888888888
3       zs      899314121
Time taken: 0.126 seconds, Fetched: 7 row(s)

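As a side note (my own addition, not from the original article): Hive's multi-insert syntax can feed several of these destinations from a single scan of the source table. A sketch, reusing the tables and the age='25' partition from above:

from wyp
insert overwrite table test partition (age='25')
  select id, name, tel
insert overwrite local directory '/home/wyp/wyp'
  select *;

One scan of wyp populates the test partition and writes the local export in the same job, which saves a full table read when you need both.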

An attentive reader might ask: when the data is exported to a file, why is the delimiter between columns not the one set on the wyp table? The answer is that before Hive 0.11.0, you could not specify the column delimiter when exporting data; only the default delimiter could be used, i.e. the ^A shown above, which makes the exported data unintuitive and rather inconvenient.
If you are using Hive 0.11.0 or later, you can specify the delimiter between columns when you export the data.

Here is a detailed description:

Hive 0.11.0 introduced a new feature: when writing query results out to a file, the user can specify the column delimiter. Earlier versions could not do this, which caused us a lot of inconvenience. Before Hive 0.11.0 we generally wrote:

    hive> insert overwrite local directory '/home/wyp/documents/result'
    hive> select * from test;

The columns in the saved file are split with ^A (\x01):

    196^A242^A3
    186^A302^A3
    22^A377^A1
    244^A51^A2

Note that \x01 is written as ^A above only for display convenience; in an actual text editor you will not see a literal ^A but a strange control character. Now let's use the new feature of Hive 0.11.0 and specify the delimiter between the output columns:

    hive> insert overwrite local directory '/home/wyp/documents/result'
    hive> row format delimited
    hive> fields terminated by '\t'
    hive> select * from test;

Look at the output again:

    196     242     3
    186     302     3
    22      377     1
    244     51      2

The results look much better. If a column is of map type, you can use the following statement to split the map's keys and values:

    hive> insert overwrite local directory './test-04'
    hive> row format delimited
    hive> fields terminated by '\t'
    hive> collection items terminated by ','
    hive> map keys terminated by ':'
    hive> select * from src;

Based on the above, let's proceed further:

hive> insert overwrite local directory '/home/yangping.wu/local'
    > row format delimited
    > fields terminated by '\t'
    > select * from wyp;


[wyp@master ~/local]$ vim 000000_0
5       wyp1    23      131212121212
6       wyp2    24      134535353535
7       wyp3    25      132453535353
8       wyp4    26      154243434355
1       wyp     25      13188888888888
2       test    30      13888888888888
3       zs      34      899314121

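The row format clause is not limited to local exports. Below is a sketch combining it with the HDFS export from the second section, assuming your Hive version (0.11.0 or later) accepts the clause for non-local directories as well:

hive> insert overwrite directory '/home/wyp/hdfs'
    > row format delimited
    > fields terminated by '\t'
    > select * from wyp;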

In fact, we can also use Hive's -e and -f options to export data. With -e, the SQL statement follows directly in double quotation marks; with -f you pass a file whose contents are a SQL statement, as follows:

  
[wyp@master ~/local]$ hive -e "select * from wyp" >> local/wyp.txt
[wyp@master ~/local]$ cat wyp.txt
5       wyp1    23      131212121212
6       wyp2    24      134535353535
7       wyp3    25      132453535353
8       wyp4    26      154243434355
1       wyp     25      13188888888888
2       test    30      13888888888888
3       zs      34      899314121

The results obtained are also \t-separated. The same can be done with the -f option:

[wyp@master ~/local]$ cat wyp.sql
select * from wyp
[wyp@master ~/local]$ hive -f wyp.sql >> local/wyp2.txt


The result of the above statement is also \t-delimited.
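One last practical note (an addition of mine, not from the original article): when redirecting hive -e to a file, the CLI's progress and log messages can get mixed into the output on some setups. Hive's -S (silent) option suppresses them so that only query results reach the file:

[wyp@master ~/local]$ hive -S -e "select * from wyp" >> local/wyp.txt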
