Operating environment: CentOS 5.6, Hadoop, Hive
Sqoop is a tool developed by Cloudera that lets Hadoop import and export data between relational databases and HDFS or Hive.
Problems you may encounter during use:
- Sqoop depends on Zookeeper, so ZOOKEEPER_HOME must be set in the environment.
- sqoop-1.2.0-cdh3b4 depends on hadoop-core-0.20.2-cdh3b4.jar, so you need to download hadoop-0.20.2-cdh3b4.tar.gz, unpack it, and copy hadoop-core-0.20.2-cdh3b4.jar from it into sqoop-1.2.0-cdh3b4/lib (a sketch of these steps follows this list).
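As a concrete sketch of the two preparation steps above (the download URL follows the usual CDH3 archive layout and the Zookeeper path assumes a package install, so adjust both to your setup):
# export ZOOKEEPER_HOME=/usr/lib/zookeeper
# wget http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3b4.tar.gz
# tar -zxf hadoop-0.20.2-cdh3b4.tar.gz
# cp hadoop-0.20.2-cdh3b4/hadoop-core-0.20.2-cdh3b4.jar sqoop-1.2.0-cdh3b4/lib/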
1 Install Sqoop
If you are using the Cloudera distribution, this is very simple:
# yum install sqoop
If you are not on the Cloudera distribution, add the Cloudera repository first:
# cd /etc/yum.repos.d
# wget http://archive.cloudera.com/redhat/cdh/cloudera-cdh3.repo
# yum -y install sqoop
Sqoop will then be installed.
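To verify the installation, print Sqoop's version:
# sqoop version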
2 Using Sqoop
First copy the mysql-connector-java-5.1.16-bin.jar file to the /usr/lib/sqoop/lib folder.
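Assuming the Connector/J archive has already been downloaded and unpacked in the current directory, the copy step looks like this:
# cp mysql-connector-java-5.1.16-bin.jar /usr/lib/sqoop/lib/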
3 Import and export data
1) List all databases on the MySQL server:
# sqoop list-databases --connect jdbc:mysql://localhost:3306/ --username root --password 123456
2) Connect to MySQL and list the tables in a database:
# sqoop list-tables --connect jdbc:mysql://localhost:3306/test --username root --password 123456
In this command, test is the name of the database on the MySQL server, and the --username and --password values are the MySQL user name and password.
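Incidentally, passing --password on the command line exposes it in the shell history and process list; Sqoop also supports -P, which prompts for the password interactively:
# sqoop list-tables --connect jdbc:mysql://localhost:3306/test --username root -P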
3) Copy the structure of a relational table into Hive:
sqoop create-hive-table --connect jdbc:mysql://localhost:3306/test --table username --username root --password 123456 --hive-table test
Here --table username names the source table in the MySQL database test, and --hive-table test is the name of the new table in Hive.
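To confirm the table was created, you can check from the Hive CLI (assuming hive is on your PATH):
# hive -e 'DESCRIBE test;'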
4) Import table data from a relational database into Hive:
sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password mysql-password --table t1 --hive-import
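Once the import finishes, a quick spot-check from the Hive side (assuming the table landed in the default database):
# hive -e 'SELECT * FROM t1 LIMIT 10;'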
5) Export table data from Hive into MySQL:
./sqoop export --connect jdbc:mysql://localhost:3306/test --username root --password admin --table uv_info --export-dir /user/hive/warehouse/uv/dt=2011-08-03
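Note that sqoop export does not create the target table: uv_info must already exist in MySQL with columns that match the exported file. A hypothetical example (these column names are placeholders, not the real schema of the uv table):
# mysql -uroot -padmin -e 'CREATE TABLE test.uv_info (dt VARCHAR(10), uv INT);'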
If an error like the following occurs:
11/08/05 10:51:22 INFO mapred.JobClient: Running job: job_201108051007_0010
11/08/05 10:51:23 INFO mapred.JobClient:  map 0% reduce 0%
11/08/05 10:51:36 INFO mapred.JobClient: Task Id : attempt_201108051007_0010_m_000000_0, Status : FAILED
java.util.NoSuchElementException
        at java.util.AbstractList$Itr.next(AbstractList.java:350)
        at uv_info.__loadFromFields(uv_info.java:194)
        at uv_info.parse(uv_info.java:143)
        at com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:79)
        at com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:38)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at com.cloudera.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:187)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
        at org.apache.hadoop.mapred.Child.main(Child.java:264)
The cause of this error is that the fields Sqoop parses from the file do not line up with the columns of the MySQL table. You therefore need to pass Sqoop a parameter telling it the file's field delimiter so that it can parse the fields correctly.
Hive's default field delimiter is '\001'.
./sqoop export --connect jdbc:mysql://localhost:3306/datacenter --username root --password admin --table uv_info --export-dir /user/hive/warehouse/uv/dt=2011-08-03 --input-fields-terminated-by '\t'
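If you are not sure which delimiter the Hive table's files actually use, you can inspect the raw data in HDFS before choosing the --input-fields-terminated-by value:
# hadoop fs -cat /user/hive/warehouse/uv/dt=2011-08-03/* | head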
That is how to use Sqoop to export data from Hive into MySQL.