Syncing MySQL to HDFS with Sqoop


Download link: http://pan.baidu.com/s/1gfHnaVL (password: 7j12)

mysql-connector version 5.1.32

If you run into problems while installing this version, see http://dbspace.blog.51cto.com/6873717/1875955 for solutions to some common issues.

Download and install:

cd /usr/local/
tar -zxvf sqoop2-1.99.3-cdh5.0.0.tar.gz
mv sqoop2-1.99.3-cdh5.0.0 sqoop
Add Sqoop2 to the system environment variables:
export SQOOP_HOME=/usr/local/sqoop
export CATALINA_BASE=$SQOOP_HOME/server
export PATH=$PATH:/usr/local/sqoop/bin
Copy the MySQL driver jar into $SQOOP_HOME/server/lib:
cp mysql-connector-java-5.1.32-bin.jar /usr/local/sqoop/server/lib/
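A quick sanity check can catch a misplaced driver jar before the server is ever started. This is a minimal sketch assuming the install layout above; `check_driver` is a hypothetical helper, not part of Sqoop:

```shell
# check_driver: succeed only if the given Sqoop home contains the
# MySQL connector jar under server/lib (layout assumed from this install).
check_driver() {
  test -f "$1/server/lib/mysql-connector-java-5.1.32-bin.jar"
}

# On a real install:
# check_driver /usr/local/sqoop && echo "driver OK"
```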
Edit the configuration files:
vim /usr/local/sqoop/server/conf/sqoop.properties
org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/usr/local/hadoop/etc/hadoop # path to Hadoop's configuration directory
vim /usr/local/sqoop/server/conf/catalina.properties
Comment out the original line 58; this property configures the paths to the Hadoop jars:
common.loader=${catalina.base}/lib,${catalina.base}/lib/*.jar,${catalina.home}/lib,${catalina.home}/lib/*.jar,${catalina.home}/../lib/*.jar,/usr/local/hadoop/share/hadoop/common/*.jar,/usr/local/hadoop/share/hadoop/common/lib/*.jar,/usr/local/hadoop/share/hadoop/hdfs/*.jar,/usr/local/hadoop/share/hadoop/hdfs/lib/*.jar,/usr/local/hadoop/share/hadoop/mapreduce/*.jar,/usr/local/hadoop/share/hadoop/mapreduce/lib/*.jar,/usr/local/hadoop/share/hadoop/tools/*.jar,/usr/local/hadoop/share/hadoop/tools/lib/*.jar,/usr/local/hadoop/share/hadoop/yarn/*.jar,/usr/local/hadoop/share/hadoop/yarn/lib/*.jar
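That long common.loader line is easy to mangle by hand. As an alternative, a small helper can generate it from the Hadoop install path; this is a sketch under the paths used above, and `make_common_loader` is a hypothetical name:

```shell
# make_common_loader: print the common.loader value for catalina.properties,
# appending each Hadoop jar directory to the stock Tomcat entries.
make_common_loader() {
  hadoop="$1"
  # literal ${catalina.*} placeholders are expanded by Tomcat, not the shell
  out='${catalina.base}/lib,${catalina.base}/lib/*.jar,${catalina.home}/lib,${catalina.home}/lib/*.jar,${catalina.home}/../lib/*.jar'
  for d in common common/lib hdfs hdfs/lib mapreduce mapreduce/lib \
           tools tools/lib yarn yarn/lib; do
    out="$out,$hadoop/share/hadoop/$d/*.jar"
  done
  printf 'common.loader=%s\n' "$out"
}

# make_common_loader /usr/local/hadoop   # paste the output into catalina.properties
```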
Start/stop the Sqoop server:
/usr/local/sqoop/sqoop2-server start/stop
Verify that the server started successfully:
Method 1: check the processes with jps (look for the Bootstrap process):
[[email protected] sqoop]# jps

25505 SqoopShell
13080 SecondaryNameNode
12878 NameNode
26568 Jps
Method 2: open http://192.168.1.114:12000/sqoop/version # Sqoop uses port 12000 by default; it can be changed in /usr/local/sqoop/server/conf/server.xml
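The same check can be scripted. The helper below only builds the version URL (host and port are parameters, defaulting to the values used in this setup; `sqoop_version_url` is an illustrative name), which you can then fetch with curl once the server is up:

```shell
# sqoop_version_url: print the REST endpoint that reports the server version.
sqoop_version_url() {
  printf 'http://%s:%s/sqoop/version\n' "${1:-localhost}" "${2:-12000}"
}

# With the server running:
# curl -s "$(sqoop_version_url 192.168.1.114)"
```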

#### Next, test the process of storing MySQL data into Hadoop
1. Log in with the client
[[email protected] bin]# sqoop2-shell
Sqoop home directory: /usr/local/sqoop
Sqoop Shell: Type 'help' or '\h' for help.
sqoop:000>
2. Create a MySQL connection. In this version, create only supports [connection|job]; note that the way connections are added differs between versions.
List the supported connectors:
sqoop:000> show connector
+----+------------------------+-----------------+------------------------------------------------------+
| Id |          Name          |     Version     |                        Class                         |
+----+------------------------+-----------------+------------------------------------------------------+
| 1  | generic-jdbc-connector | 1.99.3-cdh5.0.0 | org.apache.sqoop.connector.jdbc.GenericJdbcConnector |
+----+------------------------+-----------------+------------------------------------------------------+
## In version 1.99.7 the output format differs and more connectors and services are shown.
sqoop:000> create connection --cid 1
Creating connection for connector with id 1
Please fill following values to create new connection object
Name: mysql_to_hadoop

Connection configuration

JDBC Driver Class: com.mysql.jdbc.Driver
JDBC Connection String: jdbc:mysql://192.168.1.107:3306/sqoop # the sqoop database must already exist on 192.168.1.107
Username: sqoop ## this user must already be created in the database with access granted
Password: *******
JDBC Connection Properties:
There are currently 0 values in the map:
entry#
Security related configuration options
Max connections:
New connection was successfully created with validation status ACCEPTABLE and persistent id 2
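The connection string above follows the standard jdbc:mysql://host:port/database shape. A tiny helper (hypothetical, purely for illustration) makes the pieces explicit:

```shell
# jdbc_url: assemble a MySQL JDBC connection string from host, port, database.
jdbc_url() {
  printf 'jdbc:mysql://%s:%s/%s\n' "$1" "${2:-3306}" "$3"
}

# jdbc_url 192.168.1.107 3306 sqoop  # prints the string entered above
```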
3. Create a job
sqoop:000> create job --xid 2 --type import ## note: --xid 2 is the connection id
Creating job for connection with id 2
Please fill following values to create new job object
Name: mysql_to_hadoop

Database configuration

Schema name: sqoop # the MySQL database name
Table name: wangyuan # a table in that database
Table SQL statement:
Table column names:
Partition column name:
Nulls in partition column:
Boundary query:

Output configuration

Storage type:
 0 : HDFS
Choose: 0
Output format:
 0 : TEXT_FILE
 1 : SEQUENCE_FILE
Choose: 0
Compression format:
 0 : NONE
 1 : DEFAULT
 2 : DEFLATE
 3 : GZIP
 4 : BZIP2
 5 : LZO
 6 : LZ4
 7 : SNAPPY
Choose: 0
Output directory: hdfs://192.168.1.114:9000/home/mysql_to_hdfs2
# Note: mysql_to_hdfs2 must not already exist under /home in HDFS, but the /home path itself must exist. Port 9000 is set when configuring Hadoop, so use your actual value; you can check it via the web UI at http://ip:50070, which shows Overview 'mycat:9000' (active).
Create the HDFS path: /usr/local/hadoop/bin/hadoop fs -mkdir /home
Verify the directory: /usr/local/hadoop/bin/hadoop fs -ls /home, or via the web UI at http://ip:50070
Throttling resources
Extractors:
Loaders:
New job was successfully created with validation status FINE  and persistent id 2
sqoop:000>
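Because the job fails when the output directory already exists, it can be worth checking before starting it. A sketch assuming the paths above; `HADOOP_BIN` and `check_output_dir` are illustrative names:

```shell
# check_output_dir: fail early if the given path already exists in HDFS.
# HADOOP_BIN points at the hadoop executable (default matches this install).
check_output_dir() {
  hadoop="${HADOOP_BIN:-/usr/local/hadoop/bin/hadoop}"
  if "$hadoop" fs -test -e "$1" 2>/dev/null; then
    echo "ERROR: $1 already exists in HDFS" >&2
    return 1
  fi
}

# check_output_dir /home/mysql_to_hdfs2 && echo "output path is free"
```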
Start the job:
sqoop:000> start job --jid 2
Exception has occurred during processing command
Exception: org.apache.sqoop.common.SqoopException Message: CLIENT_0001:Server has returned exception
That message says nothing about the actual cause; enable verbose output to see the real error:
set option --name verbose --value true
sqoop:000> start job --jid 2        
Submission details
Job ID: 2
Server URL: http://localhost:12000/sqoop/
Created by: root
Creation date: 2016-11-23 21:15:27 CST
Lastly updated by: root
External ID: job_1479653943050_0007
http://haproxy:8088/proxy/application_1479653943050_0007/
Connector schema: Schema{name=wangyuan,columns=[
FixedPoint{name=id,nullable=null,byteSize=null,unsigned=null},
Date{name=c_time,nullable=null,fraction=null,timezone=null}]}
2016-11-23 21:15:27 CST: BOOTING  - Progress is not available
Output like this means the job was submitted successfully.
Check the result via the web UI:


/usr/local/hadoop/bin/hadoop fs -ls /home/

This article comes from the "DBSpace" blog; please retain this attribution: http://dbspace.blog.51cto.com/6873717/1875971
