For a work requirement, I needed to transfer data from HDFS into a corresponding table in a relational database. After searching online for a long time and finding conflicting explanations, I recorded my own test process below:
To meet this need with Sqoop, first understand what Sqoop is.
Sqoop is a tool for transferring data between Hadoop and relational databases in both directions: it can import data from a relational database (e.g. MySQL, Oracle, Postgres) into Hadoop's HDFS, and it can also export HDFS data into a relational database.
First, you need to prepare the following:
First: the lib folder on Hadoop's NameNode must contain the corresponding database driver JAR and the Sqoop JAR.
Second: create the target table in the database beforehand. Note: the format of the data files in the HDFS directory must match the corresponding table (same number of fields, in the same order).
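For example, the target table might be created with DDL like the following. The column names and types here are hypothetical, for illustration only; match them to the actual fields in your HDFS files:

```sql
-- Hypothetical DDL: one column per field in the HDFS file, in the same order.
-- Uppercase identifiers are used, per the note at the end of this article.
CREATE TABLE A_BAAT_CLIENT (
  STAT_DATE    VARCHAR2(10),
  CLIENT_ID    VARCHAR2(64),
  ACTIVE_USERS NUMBER(10)
);
```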
Because I am using an Oracle database and manipulating it from Java, the code and screenshots below use Java as the example:
First, standardize the file format in HDFS, as shown in the following figure:
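To illustrate the expected file format (the field values below are made up), each line of the file is one record: field values separated by `|`, one value per column of the target table. A minimal sketch of how such a line breaks into fields:

```java
public class DelimiterDemo {
    public static void main(String[] args) {
        // A sample pipe-delimited line, as it would appear in the HDFS file
        // (the values are hypothetical, for illustration only).
        String line = "2013-02-06|client_001|12345";
        // Split on the "|" delimiter; String.split() takes a regex,
        // so the pipe character must be escaped.
        String[] fields = line.split("\\|");
        // Each field maps to one column of the target table, in order.
        System.out.println(fields.length); // prints 3
        System.out.println(fields[0]);     // prints 2013-02-06
    }
}
```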
The Java code is as follows:
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;

import com.cloudera.sqoop.Sqoop;
import com.cloudera.sqoop.tool.ExportTool;

Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://192.168.115.5:9000");
conf.set("hadoop.job.ugi", "hadooper,hadoopgroup");
conf.set("mapred.job.tracker", "192.168.115.5:9001");

List<String> list = new ArrayList<String>(); // define the argument list
list.add("--table");
list.add("A_baat_client"); // the Oracle table the data will be imported into
list.add("--export-dir");
list.add("/home/hadoop/traffic/capuse/near7date/activeuser/capuse_near7_activeuser_2013-02-06.log"); // the HDFS path whose data will be exported into A_baat_client
list.add("--connect");
list.add("jdbc:oracle:thin:@10.18.96.107:1521:life"); // JDBC connection string for Oracle
list.add("--username");
list.add("Traffic"); // Oracle user name
list.add("--password");
list.add("Traffic"); // Oracle password
list.add("--input-fields-terminated-by");
list.add("|"); // field delimiter of the input data
list.add("-m");
list.add("1"); // number of map tasks

ExportTool exporter = new ExportTool();
Sqoop sqoop = new Sqoop(exporter);
sqoop.setConf(conf);
String[] arg = list.toArray(new String[0]);
int result = Sqoop.runSqoop(sqoop, arg);
System.out.println("res: " + result); // print the exit code (0 on success)
Finally, run this in the main method; the resulting table data is shown in the following figure:
With the operations and code above, you can generate the corresponding table data from HDFS data in Java.
However, besides implementing it in Java, you can also do the same with the basic command, as follows:
In the Hadoop bin directory:
sqoop export --connect jdbc:oracle:thin:@10.18.96.107:1521:life \
--table a_baat_client --username traffic --password traffic \
--input-fields-terminated-by '|' \
--export-dir /home/hadoop/traffic/capuse/near7date/activeuser/test.log -m 1
The options mean the same as in the Java code above.
Note:
1. Use uppercase for the database table name, user name, and password. (Lowercase may cause problems: during my tests, using lowercase produced the classic "no columns" error, so I recommend uppercase, although it is not strictly required.)
2. Create the corresponding table beforehand.
The code above is actually very simple, but if you have never worked with this before, you will run into many problems along the way, and the information online is scattered. I wrote this article as a memo for myself, and also in the hope that it helps anyone who needs it. Of course, there may well be mistakes; corrections from experts are welcome.