Implementing Data Transfer Between MySQL/Oracle and HDFS/HBase with Sqoop


This article covers the transfer of data between MySQL and HDFS with Sqoop in detail; for MySQL and HBase, and for Oracle and HBase, only the final commands are given.
I. Transferring data between MySQL and HDFS
Environment:
The host machine runs Windows 7, with MySQL installed on it; the host address is 192.168.66.96.
The three virtual machines run 32-bit Ubuntu 12.04.1.
Hadoop is installed on all three virtual machines, and passwordless SSH access among them and with the host is set up:
192.168.66.91 masternode
192.168.66.92 slavenode1
192.168.66.93 slavenode2
The required environment variables HADOOP_HOME and JAVA_HOME are configured in /etc/profile.
The experiment is carried out on masternode, which can successfully connect to MySQL.

Step one, download the installation packages:
The Hadoop version I installed is native Hadoop 0.20.203.0, which Sqoop does not support. You can either use the CDH3 build of Hadoop, or keep your existing Hadoop and copy the appropriate jars into sqoop-1.2.0-CDH3B4/lib.
Download the related files:
http://archive.cloudera.com/cdh/3/hadoop-0.20.2-CDH3B4.tar.gz
http://archive.cloudera.com/cdh/3/sqoop-1.2.0-CDH3B4.tar.gz
sqoop-1.2.0-CDH3B4 depends on hadoop-core-0.20.2-CDH3B4.jar, so download hadoop-0.20.2-CDH3B4.tar.gz, unpack it, and copy hadoop-0.20.2-CDH3B4/hadoop-core-0.20.2-CDH3B4.jar into sqoop-1.2.0-CDH3B4/lib.
In addition, Sqoop depends on mysql-connector-java-*.jar at runtime when importing MySQL data, so download mysql-connector-java-*.jar and copy it into sqoop-1.2.0-CDH3B4/lib as well.
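The steps above can be scripted roughly as follows (a sketch only; the working directory, the rename to /home/grid/sqoop, and the exact mysql-connector-java jar version are assumptions, not from the original article):
cd /home/grid
wget http://archive.cloudera.com/cdh/3/hadoop-0.20.2-CDH3B4.tar.gz
wget http://archive.cloudera.com/cdh/3/sqoop-1.2.0-CDH3B4.tar.gz
tar -zxvf hadoop-0.20.2-CDH3B4.tar.gz
tar -zxvf sqoop-1.2.0-CDH3B4.tar.gz
# copy the Hadoop core jar that Sqoop depends on
cp hadoop-0.20.2-CDH3B4/hadoop-core-0.20.2-CDH3B4.jar sqoop-1.2.0-CDH3B4/lib/
# copy the MySQL JDBC driver (the version number here is an assumption)
cp mysql-connector-java-5.1.18-bin.jar sqoop-1.2.0-CDH3B4/lib/
# the article later refers to Sqoop at /home/grid/sqoop, so a rename is assumed
mv sqoop-1.2.0-CDH3B4 sqoop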

Step two, related configuration:
Modify the Sqoop script configure-sqoop to comment out the HBase and ZooKeeper checks (unless you plan to use components on top of Hadoop such as HBase):
#if [ ! -d "${HBASE_HOME}" ]; then
#  echo "Error: $HBASE_HOME does not exist!"
#  echo 'Please set $HBASE_HOME to the root of your HBase installation.'
#  exit 1
#fi
#if [ ! -d "${ZOOKEEPER_HOME}" ]; then
#  echo "Error: $ZOOKEEPER_HOME does not exist!"
#  echo 'Please set $ZOOKEEPER_HOME to the root of your ZooKeeper installation.'
#  exit 1
#fi

Modify the /etc/profile environment variable file (after su root, or via sudo gedit /etc/profile):
Add: export SQOOP_HOME=/home/grid/sqoop
Append to the existing PATH: $SQOOP_HOME/bin
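Taken together, the additions to /etc/profile would look like this:
export SQOOP_HOME=/home/grid/sqoop
export PATH=$PATH:$SQOOP_HOME/bin
Run source /etc/profile afterwards so the change takes effect in the current shell.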

Step three, create the test user, test table, and test data in MySQL, and test Sqoop's connection to MySQL:
Create the user sqoop and grant privileges:
GRANT ALL PRIVILEGES ON *.* TO 'sqoop'@'%' IDENTIFIED BY 'sqoop' WITH GRANT OPTION;

Create a database (schema) named sqoop and a test table in it:
CREATE DATABASE sqoop;
USE sqoop;
CREATE TABLE students (
  id INT NOT NULL PRIMARY KEY,
  name VARCHAR(20),
  age INT
);

Insert test data:
INSERT INTO students VALUES (10001, 'liyang', 29);
INSERT INTO students VALUES (10002, 'lion', 28);
INSERT INTO students VALUES (10003, 'leon', 26);
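As an optional sanity check before involving Sqoop, confirm the rows on the MySQL side:
SELECT * FROM students;
This should return the three rows just inserted.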

On masternode, test that Sqoop can connect to MySQL on the host machine:
sqoop list-tables --connect jdbc:mysql://192.168.66.96:3306/sqoop --username sqoop --password sqoop
If the students table in the sqoop schema is listed, Sqoop can successfully connect to MySQL!
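For reference, a successful run simply prints the table names in the sqoop schema, here a single line (assuming only the test table exists):
students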

Step four, import the three rows of the students table in the MySQL sqoop schema into HDFS:
Start Hadoop:
start-all.sh
Verify that startup succeeded with jps.
If processes like the following are running, startup succeeded (DataNode and TaskTracker run on the slave nodes):
2820 SecondaryNameNode
4539 Jps
2887 JobTracker
2595 NameNode

To import the data from MySQL, run the following command (-m 1 runs the import with a single map task, producing a single output file):
sqoop import --connect jdbc:mysql://192.168.66.96:3306/sqoop --username sqoop --password sqoop --table students -m 1

Verify that the import succeeded:
If the import succeeded, running hadoop dfs -ls will show a folder named after the table, students, under /user/grid/.
Run hadoop dfs -ls /user/grid/students to see the file the folder contains: /user/grid/students/part-m-00000
Running hadoop dfs -cat /user/grid/students/part-m-00000 shows that the file contains the three rows of data from the students table in MySQL:
10001,liyang,29
10002,lion,28
10003,leon,26

Step five, export the data in HDFS back into the MySQL students table:
First, empty the MySQL students table:
DELETE FROM students;

Then execute the export command on masternode:
sqoop export --connect jdbc:mysql://192.168.66.96:3306/sqoop --username sqoop --password sqoop --table students --export-dir hdfs://masternode:9000/user/grid/students/part-m-00000

If it succeeds, you will see the data restored in the MySQL students table!
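As an optional check from the MySQL side, run the earlier query again:
SELECT * FROM students;
The three original rows should be back.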

Note: the process may fail because port 50010 on the slave nodes is not open; as root, open it on each slave node with sudo ufw allow 50010.

II. Transferring data between MySQL and HBase
The command format for importing MySQL data into HBase is:
sqoop import --connect jdbc:mysql://mysqlserver_ip/databasename --username username --password password --table datatable --hbase-create-table --hbase-table hbase_tablename --column-family col_fam_name --hbase-row-key key_col_name
Here databasename and datatable are the MySQL database and table names, hbase_tablename is the name of the table to create in HBase, key_col_name specifies which column of datatable becomes the row key of the new HBase table, and col_fam_name is the column family name for all columns other than the row key.

For example, the students table in MySQL can be imported into HBase with the following command:
/home/grid/sqoop/bin/sqoop import --connect jdbc:mysql://192.168.66.96/sqoop --username sqoop --password liyang16 --table students --hbase-create-table --hbase-table students --column-family stuinfo --hbase-row-key id
After it completes successfully, you can view the result in HBase with the following command:
hbase(main):011:0> scan 'students'
ROW        COLUMN+CELL
 10001     column=stuinfo:age, timestamp=1356759994058, value=29
 10001     column=stuinfo:name, timestamp=1356759994058, value=liyang
 10002     column=stuinfo:age, timestamp=1356760044478, value=28
 10002     column=stuinfo:name, timestamp=1356760044478, value=lion
 10003     column=stuinfo:age, timestamp=1356760044478, value=26
 10003     column=stuinfo:name, timestamp=1356760044478, value=leon
3 row(s) in 0.0900 seconds
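To inspect a single row instead of scanning the whole table, the HBase shell get command can be used, for example:
get 'students', '10001'
This prints just the stuinfo:age and stuinfo:name cells for row 10001.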

III. Transferring data between Oracle and HBase
Export the dept table in Oracle (columns ID, NAME, ADDR) into the dept table in HBase (row key ID, column family deptinfo):
sqoop import --append --connect jdbc:oracle:thin:@192.168.66.90:1521:orcl --username test --password test -m 1 --table dept --columns id,name,addr --hbase-create-table --hbase-table dept --hbase-row-key ID --column-family deptinfo
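As with the MySQL example, the result can be checked in the HBase shell (timestamps and values will of course depend on your dept data):
scan 'dept'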

