One: The two modes of Sqoop incremental import
Incremental import arguments:

| Argument | Description |
| --- | --- |
| `--check-column (col)` | Specifies the column to be examined when determining which rows to import. (The column should not be of type CHAR/NCHAR/VARCHAR/VARNCHAR/LONGVARCHAR/LONGNVARCHAR.) |
| `--incremental (mode)` | Specifies how Sqoop determines which rows are new. Legal values for `mode` are `append` and `lastmodified`. |
| `--last-value (value)` | Specifies the maximum value of the check column from the previous import. |
The `--incremental append` mode used here assumes that the primary key or `--split-by` column increases monotonically. If it does not, it is recommended to add a create-time column to the relational table and use `lastmodified` mode instead.
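For comparison, a `lastmodified` import might look like the sketch below. The connection details reuse the values from the script in the next section; the `createtime` column and the timestamp are placeholders you would replace with your own schema and the last value from the previous run.

```sh
sqoop import \
    --connect jdbc:mysql://192.168.1.199/test \
    --username root --password root \
    --table tags \
    --incremental lastmodified \
    --check-column createtime \
    --last-value "2014-01-01 00:00:00" \
    --append
```

In this mode Sqoop imports every row whose `createtime` is newer than `--last-value`, so rows updated after the previous import are picked up even when their primary key did not change.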
Two: Shell script
```sh
#!/bin/sh
export SQOOP_HOME=/usr/share/sqoop-1.4.4
hostname="192.168.1.199"
user="root"
password="root"
database="test"
table="tags"
curr_max=0

db_to_hive() {
    ${SQOOP_HOME}/bin/sqoop import \
        --connect jdbc:mysql://${hostname}/${database} \
        --username ${user} --password ${password} \
        --table ${table} \
        --split-by docid \
        --hive-import --hive-table lan.ding \
        --fields-terminated-by '\t' \
        --incremental append \
        --check-column docid \
        --last-value ${curr_max}
    # Query the current maximum docid so the next run continues from it
    result=`mysql -h${hostname} -u${user} -p${password} ${database} <<EOF
select max(docid) from ${table};
EOF`
    curr_max=`echo $result | awk '{print $2}'`
}

# With no arguments, loop forever, importing every 2 minutes
if [ $# -eq 0 ]; then
    while true
    do
        db_to_hive
        sleep 120
    done
    exit
fi
```
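One detail in the script worth spelling out is how `curr_max` is recovered from the mysql output. A minimal sketch, using a hypothetical maximum of 42 in place of a live query:

```shell
# The mysql heredoc query returns a header line followed by the
# value, e.g. (hypothetical maximum of 42):
result="max(docid)
42"
# Unquoted $result collapses the newline into a space, so awk sees
# one line, "max(docid) 42", and $2 is the numeric maximum.
curr_max=$(echo $result | awk '{print $2}')
echo "$curr_max"
```

The next invocation of the Sqoop job then passes this value as `--last-value`, so only rows with a larger `docid` are imported.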
The data is thus incrementally imported into Hive every 2 minutes.