Why is Sqoop 1.99.7 called Sqoop2?
The official Apache documentation for Sqoop2 at the time of writing:
http://sqoop.apache.org/docs/1.99.7/user/CommandLineClient.html#delete-link-function
I've recently been wanting to get some hands-on practice with Hadoop and HBase. Having never actually worked with them before, I was quite lost at the start. The distributed environment had already been set up by our company's ops colleagues, so once I was familiar with it, the first priority was to import a single RDBMS table of more than 70 million rows into HBase. After some research, though, it turns out that Sqoop2 does not yet support importing directly from an RDBMS into HBase; the data has to pass through HDFS as an intermediate step. Sqoop 1.99.7 is the latest Apache release, and so began my road of pitfalls:
Sqoop 1.99.7 is made up of two parts, a server and a client.
1. Start the Sqoop server first:
Run bin/sqoop2-server start. Once it is running, enter the command jps:
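On my machine the output looked roughly like this (the process IDs are only illustrative; yours will differ):

$ jps
2801 NameNode
3012 SecondaryNameNode
3455 HMaster
4102 SqoopJettyServer
4520 Jps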
SqoopJettyServer: the Sqoop2 server process, which is the one we need.
NameNode, SecondaryNameNode: the Hadoop NameNode processes. You might ask why there is no DataNode; that is because the DataNode processes run on the other two Hadoop machines.
HMaster: the HBase master process.
2. Next, open a new shell window and use the client to connect to the server.
Start the client by entering the command bin/sqoop2-shell, then enter:
set option --name verbose --value true
This command seems to make the shell print more information, which helps us troubleshoot errors.
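Put together, starting the client from the Sqoop installation directory looks roughly like this:

$ bin/sqoop2-shell
Sqoop Shell: Type 'help' or '\h' for help.
sqoop:000> set option --name verbose --value true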
Then, to connect to the Sqoop server, the example given on the official site looks like this:
set server --host sqoop2.company.net --port 12000 --webapp sqoop
We need to replace these values with our own:
sqoop:000> set server --host hadoop-node02 --port 12000 --webapp sqoop
Server is set successfully
sqoop:000>
This shows we have connected; the first step is a success.
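Note that set server only stores the connection settings. A quick way to confirm that the client can actually reach the server is to print both versions, a command also covered in the official client documentation:

sqoop:000> show version --all

If the server is reachable, this prints the server version alongside the client version.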
To import data into HDFS with Sqoop, you need to create two links and one job: a MySQL link, an HDFS link, and a job that moves the data from MySQL to HDFS.
First, let's look at the types of connectors Sqoop supports.
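The supported connectors are listed with the following command; in the resulting table you should see, among others, the two we need below, generic-jdbc-connector and hdfs-connector:

sqoop:000> show connector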
Step one: create the MySQL link. Enter the command: create link --connector generic-jdbc-connector
Here generic-jdbc-connector is the connector name from the list above.
After pressing Enter, we will see:
Name is the name we give our own MySQL link; here I enter MYSQLLINK2. The example on the official site uses a name with a space in the middle. I tried that: a link whose name contains a space can be created, but deleting it always throws an error, and I don't know why.
As you can see, our creation did not succeed. At this point the shell prints back the values you entered and asks you to confirm them one by one; pressing Enter accepts a value and moves on to the next.
It turned out that our JDBC driver class was wrong; we had written an extra prefix in front of the class name.
After we fixed that, the link was created successfully with the name MYSQLLINK2.
At this point, the MySQL link is in place.
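Since the original screenshots are gone, here is a rough sketch of the interactive session. The driver class is the standard MySQL JDBC driver; the connection string, database, username, and password are placeholders to be replaced with your own:

sqoop:000> create link --connector generic-jdbc-connector
Creating link for connector with name generic-jdbc-connector
Please fill following values to create new link object
Name: MYSQLLINK2
Driver class: com.mysql.jdbc.Driver
Connection String: jdbc:mysql://your-mysql-host:3306/testdb
Username: root
Password: ******
...
New link was successfully created with validation status OK and name MYSQLLINK2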
Next, we create the HDFS link in the same way.
Enter the command: create link --connector hdfs-connector
For Name, we enter: HDFSLINK2
URI: enter the value of the fs.defaultFS property from Hadoop's core-site.xml configuration file; mine is hdfs://hadoop-node01:9000.
Conf directory: enter the directory containing Hadoop's configuration files; here I typed /usr/local/hadoop/etc/hadoop.
Created successfully.
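Again as a sketch, with the prompts abbreviated and using the values above:

sqoop:000> create link --connector hdfs-connector
Creating link for connector with name hdfs-connector
Please fill following values to create new link object
Name: HDFSLINK2
URI: hdfs://hadoop-node01:9000
Conf directory: /usr/local/hadoop/etc/hadoop
...
New link was successfully created with validation status OK and name HDFSLINK2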
At this point, both the MySQL link and the HDFS link have been created successfully. You can use show link --all to list the information for all links.
Now we create the job that transfers the data; a sketch of the session follows below.
You don't have to fill in every prompt; most of the fields are optional. After some fiddling around, our job was created successfully.
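As a sketch of the session: the job is built from the two links, and the schema name, table name, and output directory below are placeholders for your own values; the prompts I left blank are shown as ... here:

sqoop:000> create job --from MYSQLLINK2 --to HDFSLINK2
Creating job for links with from name MYSQLLINK2 and to name HDFSLINK2
Please fill following values to create new job object
Name: testjob2
Schema name: testdb
Table name: mytable
...
Output directory: /user/sqoop/testjob2
...
New job was successfully created with validation status OK and name testjob2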
Now enter the command show job to see the job's information.
Next, let's start the job:
sqoop:000> start job --name testjob2
Well, the run succeeded, and we went into Hadoop to look at the resulting files in HDFS.
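While the job runs you can check its progress from the Sqoop shell, and afterwards inspect the output with ordinary HDFS commands. The path below is whatever you entered as the Output directory when creating the job (a placeholder here):

sqoop:000> status job --name testjob2

$ hdfs dfs -ls /user/sqoop/testjob2
$ hdfs dfs -cat /user/sqoop/testjob2/*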