Note: The following information refers to the teacher Dylan.
What is Sqoop?
Sqoop ("SQL-to-Hadoop") is an open-source tool used primarily to transfer data between Hadoop (Hive) and traditional databases (MySQL, PostgreSQL, ...). Its development has evolved into two major editions, Sqoop1 and Sqoop2.
II. Why ...
Tags: hadoop, cluster, mysql, hive. I. Description of the business: Using Hadoop 2 and other open-source frameworks, local log files are processed, and the data required after processing (PV, UV, ...) is re-imported into a relational database (MySQL). A Java program then processes the result data and organizes it into report form for later display. II. Why use ...
What is Sqoop? Sqoop is a tool used to migrate data between Hadoop and an RDBMS (MySQL, Oracle, Postgres): it can import RDBMS data into HDFS, or export HDFS data into an RDBMS. How does Sqoop work? One of its highlights is ...
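To make the two directions concrete, here is a minimal sketch of an import/export pair. The host, database, credentials, table names, and paths are placeholders, not values from the original article:

# RDBMS -> HDFS (import)
sqoop import --connect jdbc:mysql://dbhost:3306/testdb --username dbuser --password dbpass --table orders --target-dir /user/hadoop/orders

# HDFS -> RDBMS (export)
sqoop export --connect jdbc:mysql://dbhost:3306/testdb --username dbuser --password dbpass --table orders_result --export-dir /user/hadoop/orders_result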
Because of work requirements, I needed to transfer data from HDFS into a relational database as corresponding tables. After searching online for related material for a long time and finding inconsistent explanations, the following is my own test process:
To meet this need with Sqoop, first understand what Sqoop is.
1. Sqoop importing data from Hive into MySQL. For example:
sqoop export --connect jdbc:mysql://10.18.101.15:3306/wda --username restdbuser --password 123456 --table adl_trend_num_android --export-dir /apps/hive/warehouse/adldb.db/adl_trend_num_android/date_stamp=$date --input-fields-terminated-by '\t'
2. Sqoop importing data ...
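The snippet cuts off at point 2. As a hedged sketch of the opposite direction (MySQL into Hive), reusing the connection string and table from the export above purely for illustration:

sqoop import --connect jdbc:mysql://10.18.101.15:3306/wda --username restdbuser --password 123456 --table adl_trend_num_android --hive-import --hive-table adldb.adl_trend_num_android --fields-terminated-by '\t' -m 1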
First, install the MySQL database, using the command sudo apt-get install mysql-server. Then create the table and insert the data. Then download Sqoop and the JAR package that connects to the MySQL database. The next step is to install Sqoop. The first step is to configure the sq...
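A minimal sketch of the setup steps just described; the database name, table definition, and sample row are hypothetical, since the original post's screenshots are not reproduced here:

sudo apt-get install mysql-server
mysql -u root -p -e "CREATE DATABASE testdb; CREATE TABLE testdb.users (id INT PRIMARY KEY, name VARCHAR(50)); INSERT INTO testdb.users VALUES (1, 'alice');"
# copy the MySQL JDBC driver into Sqoop's lib directory
cp mysql-connector-java-5.1.*.jar $SQOOP_HOME/lib/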
/hive/warehouse/data_w.db/seq_fdc_jplp --columns goal_ocityid,goal_issueid,compete_issueid,ncompete_rank --input-fields-terminated-by '\001' --input-lines-terminated-by '\n'
Be sure to specify the --columns parameter; otherwise an error is reported saying the columns cannot be found. Usage: --columns <col,col,col...>
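Putting the fragment above back together, a complete export with --columns might look like the following sketch. The connection URL, credentials, and target MySQL table are assumptions; the HDFS path, column list, and terminators come from the fragment itself:

sqoop export --connect jdbc:mysql://dbhost:3306/report_db --username dbuser --password dbpass --table seq_fdc_jplp --export-dir /hive/warehouse/data_w.db/seq_fdc_jplp --columns goal_ocityid,goal_issueid,compete_issueid,ncompete_rank --input-fields-terminated-by '\001' --input-lines-terminated-by '\n'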
Check whether data is imported successfully.
sqoop eval --connect jdbc:oracle:thin:@localhost:p
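The eval command above is truncated. For reference, a hedged sketch of a complete sqoop eval call (the port, SID, credentials, and query are placeholders):

sqoop eval --connect jdbc:oracle:thin:@localhost:1521:orcl --username dbuser --password dbpass --query "SELECT COUNT(*) FROM imported_table"

sqoop eval executes the statement directly against the database and prints the result, which makes it a quick way to verify an import.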
1. Install sqoop
Download sqoop-1.2.0.tar.gz (version 1.2.0 is compatible with Hadoop 0.20).
Put hadoop-core-0.20.2-cdh3u3.jar and hadoop-tools-0.20.2-cdh3u3.jar into the sqoop/lib directory. These two JAR packages are from Cloudera; you can download them from its official website.
2. Import data from MySQL
Go to ...
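The snippet ends here; as a hedged sketch, a basic MySQL-to-HDFS import with this sqoop-1.2.0 setup typically looks like the following (host, database, table, and credentials are placeholders):

sqoop import --connect jdbc:mysql://localhost:3306/testdb --username root --password dbpass --table users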
Tags: hive. I. Sqoop synchronizing a MySQL table structure to Hive:
sqoop create-hive-table --connect jdbc:mysql://ip:3306/sampledata --table t1 --username Dev --password 1234 --hive-table T1;
Execution exits at this step, but the T1 table directory cannot be found under Hadoop's HDFS /hive/warehouse/ directory. A normal execution completes as follows. The error was that Hive's JAR packages were missing. The full set of JAR packages should look like this: this is all from the hadoop-2...
I. Two ways of Sqoop incremental import
Incremental import arguments:

Argument                Description
--check-column (col)    Specifies the column to be examined when determining which rows to import. (The column should not be of type CHAR/NCHAR/VARCHAR/VARNCHAR/LONGVARCHAR/LONGNVARCHAR.)
--incremental (mode)    Specifies how Sqoop determines which rows are new. Legal values for mode include append and lastmodified.
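A hedged sketch of both modes, using placeholder connection details and assuming a table with a numeric id column and a last_updated timestamp column:

# append mode: import rows whose id exceeds the previous high-water mark
sqoop import --connect jdbc:mysql://dbhost:3306/testdb --username dbuser --password dbpass --table orders --incremental append --check-column id --last-value 1000

# lastmodified mode: import rows modified after the given timestamp
sqoop import --connect jdbc:mysql://dbhost:3306/testdb --username dbuser --password dbpass --table orders --incremental lastmodified --check-column last_updated --last-value "2014-07-19 12:00:00"

At the end of each run, Sqoop prints the new --last-value to pass to the next incremental import.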
The previous article described the installation of Sqoop 1.4.6 and how to import MySQL data into Hadoop; the following are simple commands for moving data back and forth between the two. To display MySQL database information, the general Sqoop ...
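The command is cut off above. A hedged sketch of the usual way to display database information with Sqoop (connection details are placeholders):

sqoop list-databases --connect jdbc:mysql://localhost:3306 --username root --password dbpass
sqoop list-tables --connect jdbc:mysql://localhost:3306/testdb --username root --password dbpass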
Objective: This article is primarily a summary of the pitfalls encountered when importing data from MySQL to Hive with Sqoop. Environment:
System: CentOS 6.5
Hadoop: Apache 2.7.3
MySQL: 5.1.73
JDK: 1.8
Sqoop: 1.4.7
Hadoop runs in pseudo-distributed mode. I. The import command used ...
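The snippet cuts off before showing the command, so here is a hedged sketch of a typical MySQL-to-Hive import under this Sqoop 1.4.7 setup (database, table, and credentials are placeholders):

sqoop import --connect jdbc:mysql://localhost:3306/testdb --username root --password dbpass --table t1 --hive-import --hive-table t1 --create-hive-table -m 1

On a pseudo-distributed Hadoop with a small table, -m 1 keeps the job to a single mapper and avoids needing a --split-by column.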
## After the above is complete, configure sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz on the h3 machine. Import the data from the users table in the MySQL test library on the host machine into HDFS. By default Sqoop runs MapReduce with 4 map tasks for the import, storing the result in the HDFS path /user/root/users (user: the default prefix; root: the MySQL database user; users: the table name), a directory with four output files.
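A hedged sketch of that import; the host IP and password are placeholders, while the database (test), user (root), and table (users) come from the description above:

sqoop import --connect jdbc:mysql://hostip:3306/test --username root --password dbpass --table users

With no --target-dir given, Sqoop writes to /user/<current-user>/users, and the default 4 map tasks produce four part files (part-m-00000 through part-m-00003).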
... INFO mapred.JobClient: Failed map tasks=1
14/07/19 12:04:33 INFO mapred.JobClient: Launched map tasks=4
14/07/19 12:04:33 INFO mapred.JobClient: Total time spent by all maps in occupied slots (ms)=34774
14/07/19 12:04:33 INFO mapred.JobClient: Total time spent by all reduces in occupied slots (ms)=0
14/07/19 12:04:33 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
14/07/19 12:04:33 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (m
Let me tell you: Big Data engineers earn annual salaries of more than 500,000, and there is a gap of 1.5 million technical staff. In the future, high-end technical talents will be snapped up by enterprises. Big data means scarcer talent and higher salaries. Next, we will analyze ...
Sqoop imported data from MySQL to Hive and reported that database access was denied. The weird thing is that the Sqoop error says the connection to the local MySQL was rejected, rather than that the connection to the target MySQL was denied. It also connected to ZooKeeper and prompted that the MySQL connections from all the ZooKeeper hosts were denied. The log is below. In fact ...
If you import data from the same table more than once, the data is appended to the HDFS directory. Parallel import: Suppose you need this Sqoop command to import data from Oracle into HDFS: sqoop import --append --connect $CONNECTURL --username $ORACLENAME --password $ORACL...
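The command is truncated; a hedged sketch of how such a parallel, append-mode import from Oracle is typically completed (the environment variables follow the fragment above, and the table, split column, and target directory are placeholders):

sqoop import --append --connect $CONNECTURL --username $ORACLENAME --password $ORACLEPASSWORD --table ORDERS --split-by ORDER_ID -m 8 --target-dir /data/orders

--split-by names a numeric column whose value range Sqoop divides among the mappers, and -m sets how many mappers run in parallel.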
Test: with Oozie, execute a Sqoop action to import data from DB2 into a Hive partitioned table. Things to be aware of: 1. Add the hive.metastore.uris parameter; otherwise the data cannot be loaded into the Hive table. Also, if there is more than one such operation in one XML file, this parameter needs to be configured in each action. 2. Be aware of escaping ...
col_name                        data_type
category_id                     bigint
category_name                   string
category_level                  int
default_import_categ_prior      int
user_import_categ_prior        int
default_eliminate_categ_prior   int
user_eliminate_categ_prior      int
update_time                     string

The fields of the Hive table are separated by \001, the rows are separated by \n, and empty fields are filled with \N. Now you need to export the Hive table pms...
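A hedged sketch of an export matching that layout; the MySQL connection and target table are placeholders, while the field terminator and the \N null marker come from the table description above:

sqoop export --connect jdbc:mysql://dbhost:3306/pms --username dbuser --password dbpass --table category_priority --export-dir /user/hive/warehouse/pms.db/category_priority --input-fields-terminated-by '\001' --input-null-string '\\N' --input-null-non-string '\\N'

--input-null-string and --input-null-non-string tell Sqoop which marker in the HDFS files represents NULL for string and non-string columns respectively.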