Alex's Novice Hadoop Tutorial: Lesson 9, Sqoop1 Exporting from HBase or Hive to MySQL


Today I'll show you how to use Sqoop to export data from HBase or Hive to MySQL. But first, a caveat.


Sqoop currently has no way to export data directly from HBase to MySQL. You have to go through Hive and build two tables: an external table backed by the HBase table, and a plain HDFS-backed native Hive table. First copy the data from the external table into the native table (used only as a temporary staging table), then export the staging table to MySQL with Sqoop.
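To give you the shape of the workaround before the step-by-step walkthrough, here is a minimal sketch of the whole flow, using the table and database names that appear later in this lesson (h_employee, h_employee_export, sqoop_test):

# 1. Copy from the HBase-backed external table into the plain Hive staging table on HDFS
hive -e "INSERT OVERWRITE TABLE h_employee_export SELECT * FROM h_employee;"

# 2. Export the staging table's HDFS files to MySQL with Sqoop
sqoop export \
  --connect jdbc:mysql://localhost:3306/sqoop_test \
  --username root --password root \
  --table employee \
  -m 1 \
  --export-dir /user/hive/warehouse/h_employee_export/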


Data preparation: create an empty MySQL table
CREATE TABLE `employee` (
  `rowkey` int(11) NOT NULL,
  `id` int(11) NOT NULL,
  `name` varchar(20) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM;



Note: Out of habit, the Hive column that maps HBase's rowkey is usually named key, so you might be tempted to name the corresponding MySQL column key as well. But key is a reserved word in MySQL, which makes the INSERT statements Sqoop generates fail. That is why the MySQL column above is called rowkey instead.
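To make the issue concrete, here is a small, hypothetical illustration (it reuses the sqoop_test database and employee table from above; the exact error text depends on your MySQL version):

# If the MySQL column had been named `key`, an unquoted INSERT like the ones
# Sqoop generates would hit the reserved word and fail with a syntax error:
mysql -u root -p sqoop_test -e "INSERT INTO employee (key, id, name) VALUES (1, 1, 'peter');"
# => ERROR 1064 (42000): syntax error near 'key, id, name) ...'

# Naming the column rowkey, as the table above does, avoids the problem:
mysql -u root -p sqoop_test -e "INSERT INTO employee (rowkey, id, name) VALUES (1, 1, 'peter');"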
Create the employee table in HBase and insert some data
hbase(main):005:0> create 'employee','info'
0 row(s) in 0.4740 seconds
=> Hbase::Table - employee
hbase(main):006:0> put 'employee',1,'info:id',1
0 row(s) in 0.2080 seconds
hbase(main):008:0> scan 'employee'
ROW                  COLUMN+CELL
 1                   column=info:id, timestamp=1417591291730, value=1
1 row(s) in 0.0610 seconds
hbase(main):009:0> put 'employee',1,'info:name','peter'
0 row(s) in 0.0220 seconds
hbase(main):010:0> scan 'employee'
ROW                  COLUMN+CELL
 1                   column=info:id, timestamp=1417591291730, value=1
 1                   column=info:name, timestamp=1417591321072, value=peter
1 row(s) in 0.0450 seconds
hbase(main):011:0> put 'employee',2,'info:id',2
0 row(s) in 0.0370 seconds
hbase(main):012:0> put 'employee',2,'info:name','paul'
0 row(s) in 0.0180 seconds
hbase(main):013:0> scan 'employee'
ROW                  COLUMN+CELL
 1                   column=info:id, timestamp=1417591291730, value=1
 1                   column=info:name, timestamp=1417591321072, value=peter
 2                   column=info:id, timestamp=1417591500179, value=2
 2                   column=info:name, timestamp=1417591512075, value=paul
2 row(s) in 0.0440 seconds
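If you would rather not type the puts one by one, the same rows can be loaded from a script; a small sketch, assuming the employee table and info column family created above:

hbase shell <<'EOF'
put 'employee', '1', 'info:id', '1'
put 'employee', '1', 'info:name', 'peter'
put 'employee', '2', 'info:id', '2'
put 'employee', '2', 'info:name', 'paul'
scan 'employee'
EOF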

Create the Hive external table. Hive has two kinds of tables: native (managed) tables and external tables. Native tables are stored as plain files on HDFS, while external tables are backed by other frameworks such as HBase. We now create a Hive external table backed by the employee HBase table we just created.
hive> CREATE EXTERNAL TABLE h_employee(key int, id int, name string)
    > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    > WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:id,info:name")
    > TBLPROPERTIES ("hbase.table.name" = "employee");
OK
Time taken: 0.324 seconds
hive> select * from h_employee;
OK
1	1	peter
2	2	paul
Time taken: 1.129 seconds, Fetched: 2 row(s)

Create the Hive native table. This native table is only used temporarily as a staging area for the export, so it is named h_employee_export; the field delimiter is a comma.
CREATE TABLE h_employee_export (key INT, id INT, name STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054';
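By the way, '\054' is simply the octal escape for a comma, so an equivalent way to declare the same table (a sketch, run non-interactively from the shell) is:

hive -e "CREATE TABLE h_employee_export (key INT, id INT, name STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';"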


Import data from the source Hive table into the temporary table
First, import the data from h_employee (the HBase-backed external table) into h_employee_export (the native Hive table).
hive> INSERT OVERWRITE TABLE h_employee_export SELECT * FROM h_employee;
hive> SELECT * FROM h_employee_export;
OK
1	1	peter
2	2	paul
Time taken: 0.359 seconds, Fetched: 2 row(s)

Let's go see what the actual stored text file looks like.
$ hdfs dfs -cat /user/hive/warehouse/h_employee_export/000000_0
1,1,peter
2,2,paul
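Before exporting, it is also worth a quick sanity check that the staging table really picked up everything from the source table; a small sketch:

# Both counts should match (2 rows in this example)
hive -e "SELECT COUNT(*) FROM h_employee;"
hive -e "SELECT COUNT(*) FROM h_employee_export;"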



Export data from Hive to MySQL
$ sqoop export --connect jdbc:mysql://localhost:3306/sqoop_test --username root --password root --table employee -m 1 --export-dir /user/hive/warehouse/h_employee_export/
Warning: /usr/lib/sqoop/../hive-hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
14/12/05 08:49:35 INFO sqoop.Sqoop: Running Sqoop version: 1.4.4-cdh5.0.1
14/12/05 08:49:35 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/12/05 08:49:35 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/12/05 08:49:35 INFO tool.CodeGenTool: Beginning code generation
14/12/05 08:49:36 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `employee` AS t LIMIT 1
14/12/05 08:49:36 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `employee` AS t LIMIT 1
14/12/05 08:49:36 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-wlsuser/compile/d16eb4166baf6a1e885d7df0e2638685/employee.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/12/05 08:49:39 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-wlsuser/compile/d16eb4166baf6a1e885d7df0e2638685/employee.jar
14/12/05 08:49:39 INFO mapreduce.ExportJobBase: Beginning export of employee
14/12/05 08:49:41 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/12/05 08:49:43 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/12/05 08:49:43 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
14/12/05 08:49:43 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/12/05 08:49:43 INFO client.RMProxy: Connecting to ResourceManager at hadoop01/192.111.78.111:8032
14/12/05 08:49:45 INFO input.FileInputFormat: Total input paths to process : 1
14/12/05 08:49:45 INFO input.FileInputFormat: Total input paths to process : 1
14/12/05 08:49:45 INFO mapreduce.JobSubmitter: number of splits:1
14/12/05 08:49:46 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1406097234796_0037
14/12/05 08:49:46 INFO impl.YarnClientImpl: Submitted application application_1406097234796_0037
14/12/05 08:49:46 INFO mapreduce.Job: The url to track the job: http://hadoop01:8088/proxy/application_1406097234796_0037/
14/12/05 08:49:46 INFO mapreduce.Job: Running job: job_1406097234796_0037
14/12/05 08:49:59 INFO mapreduce.Job: Job job_1406097234796_0037 running in uber mode : false
14/12/05 08:49:59 INFO mapreduce.Job: map 0% reduce 0%
14/12/05 08:50:10 INFO mapreduce.Job: map 100% reduce 0%
14/12/05 08:50:10 INFO mapreduce.Job: Job job_1406097234796_0037 completed successfully
14/12/05 08:50:10 INFO mapreduce.Job: Counters: 30
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=99761
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=166
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=4
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=0
    Job Counters
        Launched map tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=8805
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=8805
        Total vcore-seconds taken by all map tasks=8805
        Total megabyte-seconds taken by all map tasks=9016320
    Map-Reduce Framework
        Map input records=2
        Map output records=2
        Input split bytes=144
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=97
        CPU time spent (ms)=1360
        Physical memory (bytes) snapshot=167555072
        Virtual memory (bytes) snapshot=684212224
        Total committed heap usage (bytes)=148897792
    File Input Format Counters
        Bytes Read=0
    File Output Format Counters
        Bytes Written=0
14/12/05 08:50:10 INFO mapreduce.ExportJobBase: Transferred 166 bytes in 27.0676 seconds (6.1328 bytes/sec)
14/12/05 08:50:10 INFO mapreduce.ExportJobBase: Exported 2 records.
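Note the warning near the top of the log: putting the password on the command line is insecure, and Sqoop suggests -P, which prompts for the password interactively. A variant of the same export using that flag (a sketch; everything else is unchanged):

sqoop export \
  --connect jdbc:mysql://localhost:3306/sqoop_test \
  --username root -P \
  --table employee \
  -m 1 \
  --export-dir /user/hive/warehouse/h_employee_export/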




Note
There is one line in this log worth pointing out:
14/12/05 08:49:46 INFO mapreduce.Job: The url to track the job: http://hadoop01:8088/proxy/application_1406097234796_0037/

This means you can open that address in a browser to watch the task run. If your task hangs for a long time or fails, go to this address to read the detailed error log.
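If you prefer the command line to the browser, the aggregated logs for the same application can usually be fetched with the YARN CLI (a sketch, using the application id from the log above; this assumes log aggregation is enabled on your cluster):

yarn logs -applicationId application_1406097234796_0037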

View Results
mysql> SELECT * FROM employee;
+--------+----+-------+
| rowkey | id | name  |
+--------+----+-------+
|      1 |  1 | peter |
|      2 |  2 | paul  |
+--------+----+-------+

The export succeeded.
