Three different ways to export data in Hive

Depending on where the data is exported to, these methods fall into three types:
(1) export to the local file system;
(2) export to HDFS;
(3) export to another Hive table.
Rather than keeping this purely theoretical, I will walk through each one with actual commands, step by step.
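For reference, all of the examples below read from a table named wyp with the columns id, name, age, and tel. Its DDL is not given in this article; a minimal sketch, with column types inferred from the sample data, might be:

hive> create table wyp (id int, name string, age int, tel string);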


First, export to the local file system

  
hive> insert overwrite local directory '/home/wyp/wyp'
    > select * from wyp;
This HQL statement needs a MapReduce job to complete. After it runs, a directory /home/wyp/wyp is generated on the local file system, containing the reduce output (the generated file is named 000000_0). We can look at the contents of this file:

[wyp@master ~/wyp]$ vim 000000_0
5^Awyp1^A23^A131212121212
6^Awyp2^A24^A134535353535
7^Awyp3^A25^A132453535353
8^Awyp4^A26^A154243434355
1^Awyp^A25^A13188888888888
2^Atest^A30^A13888888888888
3^Azs^A34^A899314121
As you can see, this is all of the data in the wyp table. The delimiter between columns is ^A (the control character \001).
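If you want to see the delimiter explicitly, one option (assuming GNU coreutils is available) is to dump the file with non-printing characters made visible, in which case each \001 shows up as ^A:

[wyp@master ~/wyp]$ cat -A 000000_0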

Unlike when importing data into Hive, you cannot use INSERT INTO to export data:

  
hive> insert into local directory '/home/wyp/wyp'
    > select * from wyp;
NoViableAltException(79@[])
        at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:683)
        at org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:30667)
        at org.apache.hadoop.hive.ql.parse.HiveParser.regular_body(HiveParser.java:28421)
        at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatement(HiveParser.java:28306)
        at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:28100)
        at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1213)
        at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:928)
        at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:190)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:418)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
FAILED: ParseException line 1:12 missing TABLE at 'local' near 'local' in select clause
line 1:18 cannot recognize input near 'directory' ''/home/wyp/wyp'' 'select' in select clause

Second, export to HDFS
This is just as simple as exporting data to the local file system; it can be done with the following statement:

  
hive> insert overwrite directory '/home/wyp/hdfs'
    > select * from wyp;

This will save the query results into the /home/wyp/hdfs directory on HDFS. Note that compared with the HQL used to export to the local file system, this statement has one fewer keyword (LOCAL), and the data is stored under a different path (on HDFS rather than on the local disk).
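To check the result, you can list the export directory with the HDFS shell; the exact file names (such as 000000_0) depend on how many reducers ran:

[wyp@master ~]$ hadoop fs -ls /home/wyp/hdfs
[wyp@master ~]$ hadoop fs -cat /home/wyp/hdfs/000000_0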

Third, export to another Hive table

This is the same statement used to import data into Hive, as follows:

 
hive> insert into table test
    > partition (age='25')
    > select id, name, tel
    > from wyp;
#####################################################################
   (a bunch of MapReduce job information, omitted here)
#####################################################################
Total MapReduce CPU Time Spent: 1 seconds 310 msec
OK
Time taken: 19.125 seconds

hive> select * from test;
OK
5       wyp1    131212121212    25
6       wyp2    134535353535    25
7       wyp3    132453535353    25
8       wyp4    154243434355    25
1       wyp     13188888888888  25
2       test    13888888888888  25
3       zs      899314121       25
Time taken: 0.126 seconds, Fetched: 7 row(s)
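Note that the test table must already exist and be partitioned by age for this INSERT to work. Its DDL is not shown in this article; a minimal sketch (column types assumed) would be:

hive> create table test (id int, name string, tel string)
    > partitioned by (age string);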
An attentive reader may ask: when data is exported to a file, why isn't the column delimiter the one defined for the wyp table? In fact, before Hive 0.11.0 you could not specify the column delimiter when exporting data; only the default delimiter, the ^A shown above, was used, so the exported data is not very readable and is rather inconvenient.
If you are using Hive 0.11.0 or later, you can specify the delimiter between columns when you export the data.

Here is a detailed description:

Hive 0.11.0 introduced a new feature: when writing query results out to a file, the user can specify the column delimiter. In earlier versions the delimiter between columns could not be specified, which caused a lot of inconvenience. Before Hive 0.11.0 we would generally write:

hive> insert overwrite local directory '/home/wyp/documents/result'
hive> select * from test;

The columns in the saved file are separated by ^A (\x01):

196^A242^A3
186^A302^A3
22^A377^A1
244^A51^A2

Note that \x01 is written as ^A above only for display convenience; in an actual text editor you will not see "^A" but a strange-looking symbol. Now let's use the new feature introduced in Hive 0.11.0 and specify the delimiter between the output columns:

hive> insert overwrite local directory '/home/wyp/documents/result'
hive> row format delimited
hive> fields terminated by '\t'
hive> select * from test;

Looking at the output again:

196     242     3
186     302     3
22      377     1
244     51      2
The results look much better. If a column is of map type, you can use the following statement to split the map keys and values as well:

hive> insert overwrite local directory './test-04'
hive> row format delimited
hive> fields terminated by '\t'
hive> collection items terminated by ','
hive> map keys terminated by ':'
hive> select * from src;
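The definition of the src table is not given in this article. As a hypothetical illustration, suppose it contained a map column, created with something like:

hive> create table src (id int, props map<string, string>);

With the delimiters above, each exported row would then consist of the id, a tab, and the map serialized as k1:v1,k2:v2, with entries separated by ',' and each key separated from its value by ':'.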


Based on the above, let's proceed further:


hive> insert overwrite local directory '/home/yangping.wu/local'
    > row format delimited
    > fields terminated by '\t'
    > select * from wyp;

[wyp@master ~/local]$ vim 000000_0
5       wyp1    23      131212121212
6       wyp2    24      134535353535
7       wyp3    25      132453535353
8       wyp4    26      154243434355
1       wyp     25      13188888888888
2       test    30      13888888888888
3       zs      34      899314121
In fact, we can also use hive's -e and -f options to export data. With -e, the SQL statement follows directly in double quotation marks; with -f, you pass a file whose contents are an SQL statement, as follows:

  
[wyp@master ~/local]$ hive -e "select * from wyp" >> local/wyp.txt
[wyp@master ~/local]$ cat wyp.txt
5       wyp1    23      131212121212
6       wyp2    24      134535353535
7       wyp3    25      132453535353
8       wyp4    26      154243434355
1       wyp     25      13188888888888
2       test    30      13888888888888
3       zs      34      899314121

The results here are also separated by \t. The same can be done with the -f option:

[wyp@master ~/local]$ cat wyp.sql
select * from wyp
[wyp@master ~/local]$ hive -f wyp.sql >> local/wyp2.txt

The result of the above statement is also separated by \t.
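If informational messages (such as "Time taken") clutter the console when running queries this way, the Hive CLI also accepts a -S (silent) flag that suppresses them; for example:

[wyp@master ~/local]$ hive -S -e "select * from wyp" >> local/wyp.txt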
