CopyTable
(1) First, look at the usage of the CopyTable command
$ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable
Usage: CopyTable [general options] [--starttime=X] [--endtime=Y] [--new.name=NEW] [--peer.adr=ADR] <tablename>

Options:
  rs.class     hbase.regionserver.class of the peer cluster
               specify if different from current cluster
  rs.impl      hbase.regionserver.impl of the peer cluster
  startrow     the start row
  stoprow      the stop row
  starttime    beginning of the time range (unixtime in millis)
               without endtime means from starttime to forever
  endtime      end of the time range. Ignored if no starttime specified.
  versions     number of cell versions to copy
  new.name     new table's name
  peer.adr     Address of the peer cluster given in the format
               hbase.zookeeper.quorum:hbase.zookeeper.client.port:zookeeper.znode.parent
  families     comma-separated list of families to copy
               To copy from cf1 to cf2, give sourceCfName:destCfName.
               To keep the same name, just give "cfName"
  all.cells    also copy delete markers and deleted cells

Args:
  tablename    Name of the table to copy

Examples:
  To copy 'TestTable' to a cluster that uses replication for a 1 hour window:
  $ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1265875194289 --endtime=1265878794289 --peer.adr=server1,server2,server3:2181:/hbase --families=myOldCf:myNewCf,cf2,cf3 TestTable

For performance consider the following general options:
  -Dhbase.client.scanner.caching=100
  -Dmapred.map.tasks.speculative.execution=false
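The starttime and endtime options expect Unix time in milliseconds, which is easy to get wrong by a factor of 1000. The following is a minimal sketch, assuming bash; the helper name window_millis is hypothetical and not part of HBase. It turns an end time in seconds and a window length in hours into the two options.

```shell
#!/usr/bin/env bash
# window_millis END_SECONDS HOURS
# Hypothetical helper: prints --starttime/--endtime in Unix milliseconds
# for a window of HOURS hours ending at END_SECONDS (Unix seconds).
window_millis() {
  local end_s=$1 hours=$2
  local start_s=$(( end_s - hours * 3600 ))
  echo "--starttime=$(( start_s * 1000 )) --endtime=$(( end_s * 1000 ))"
}

# Example: the last hour, ending now.
window_millis "$(date +%s)" 1
```

The printed options can be pasted into the CopyTable command line in front of the table name.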
(2) Data migration in three steps
Collect the splits of the table on cluster A (source cluster)
Get the end key of each region of the table you want to migrate from the HBase Web UI (hbasemaster:16010); these end keys are the split points.
The end keys are collected so that the table on the destination cluster can be pre-split. Without pre-splitting, all migrated data would land in a single region; with it, the load is balanced across regions.
Create the table on cluster B (destination cluster) using the splits collected in step 1
create 'tablename', 'cf-name', {SPLITS => ['endkey-1', 'endkey-2', ..., 'endkey-n']}
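With many regions, typing the split list by hand is error-prone. The following is a rough sketch, assuming bash and end keys that are plain printable strings without single quotes; the function name build_create_stmt is hypothetical. It prints the HBase shell create statement from a list of end keys and skips empty keys, because the last region of a table has an empty end key.

```shell
#!/usr/bin/env bash
# build_create_stmt TABLE CF ENDKEY...
# Hypothetical helper: prints an HBase shell "create" statement that
# pre-splits TABLE at the given end keys.
build_create_stmt() {
  local table=$1 cf=$2
  shift 2
  local splits="" key
  for key in "$@"; do
    # The last region has an empty end key; it is not a split point.
    [ -z "$key" ] && continue
    splits="${splits:+$splits, }'$key'"
  done
  echo "create '$table', '$cf', {SPLITS => [$splits]}"
}

build_create_stmt tablename cf-name endkey-1 endkey-2 endkey-3
```

The output is meant to be pasted into the HBase shell on cluster B. Binary row keys would need escaping that this sketch does not handle.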
Migrate the table data from cluster A to cluster B using the CopyTable command
Run on cluster A:
$ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable \
    -Dhbase.client.scanner.caching=1000 \
    -Dmapred.map.tasks.speculative.execution=false \
    --peer.adr=worker1,worker2,...,workern:2181:/hbase \
    'tablename'
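The peer.adr value has three colon-separated parts: the comma-separated ZooKeeper quorum, the client port, and the znode parent. The following is a small sketch, assuming bash; the function names peer_adr and copytable_cmd are hypothetical. It assembles that value from a host list and prints the full command as a dry run rather than executing it.

```shell
#!/usr/bin/env bash
# peer_adr PORT ZNODE HOST...
# Hypothetical helper: joins the hosts with commas and appends port and znode.
peer_adr() {
  local port=$1 znode=$2
  shift 2
  local IFS=,
  echo "$*:$port:$znode"
}

# copytable_cmd TABLE PEER_ADR
# Prints the CopyTable command line instead of running it (dry run).
copytable_cmd() {
  local table=$1 peer=$2
  echo "bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable" \
       "-Dhbase.client.scanner.caching=1000" \
       "-Dmapred.map.tasks.speculative.execution=false" \
       "--peer.adr=$peer" "$table"
}

copytable_cmd tablename "$(peer_adr 2181 /hbase worker1 worker2 worker3)"
```

Printing the command first makes it easy to check the quorum, port, and znode before starting a long-running MapReduce job.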
Precautions
Add the IP addresses of worker1, worker2, ..., workern to /etc/hosts on the cluster A nodes; otherwise the MapReduce job will hang at "map 0% reduce 0%".
sudo vim /etc/hosts
192.168.1.100 worker1
192.168.1.101 worker2
....
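Before starting the job, it can help to confirm that every worker name is actually listed. The following is a minimal sketch, assuming bash and awk; the function name missing_hosts is hypothetical. It prints each host name that does not appear as a name field on a non-comment line of the given hosts file.

```shell
#!/usr/bin/env bash
# missing_hosts HOSTS_FILE HOST...
# Hypothetical helper: prints every HOST that has no entry in HOSTS_FILE.
missing_hosts() {
  local hosts_file=$1 host
  shift
  for host in "$@"; do
    awk -v h="$host" '
      /^[[:space:]]*#/ { next }            # skip comment lines
      {
        for (i = 2; i <= NF; i++) {        # field 1 is the IP address
          if ($i ~ /^#/) break             # stop at an inline comment
          if ($i == h) found = 1
        }
      }
      END { exit !found }
    ' "$hosts_file" || echo "$host"
  done
}

# Example: list the workers still missing from /etc/hosts.
missing_hosts /etc/hosts worker1 worker2
```

An empty result means all listed names have an entry. This only inspects the file; names that resolve through DNS instead would still be reported as missing.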