HBase (v): HBase Operations Management


HBase ships with many tools that can be used for management, analysis, repair, and debugging. Some of their entry points are in the HBase shell client; the others live in HBase's jar packages.

Directory:

    • HBCK
    • HFile
    • Data Backup and Recovery
      1. Snapshots
      2. Replication
      3. Export
      4. CopyTable
      5. HTable API
      6. Offline Backup of HDFS data

HBCK:

  • The hbck tool detects and repairs inconsistencies in HBase's underlying state, including mismatches between the in-memory state of the Master and RegionServers and the data on HDFS, holes in the region chain, metadata inconsistencies, and more.
  • Command: hbase hbck -help shows help for all options
  • Command: hbase hbck -details shows a full report covering all regions
  • Command: hbase hbck -metaonly checks only the state of the hbase:meta table
  • Quick-fix commands (a combined check-then-repair session is sketched after this list):
  • Command: hbase hbck -repair -ignorePreCheckPermission
  • Command: hbase hbck -repairHoles -ignorePreCheckPermission
  • For an application example, see: HBase (iii): Azure HDInsight HBase table data imported into a local HBase
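
A minimal check-then-repair session built from the commands above (the -repair options rewrite cluster metadata, so always run the read-only checks first):

      # Read-only: report inconsistencies across all regions
      hbase hbck -details

      # Read-only: check only the hbase:meta table
      hbase hbck -metaonly

      # Low-risk fix first: plug holes in the region chain
      hbase hbck -repairHoles -ignorePreCheckPermission

      # Full repair as a last resort
      hbase hbck -repair -ignorePreCheckPermission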

HFile:

    • The hfile tool prints the contents of an HFile; command and parameters as follows (additional options are sketched below):
    • Command: hbase hfile -p -f /apps/hbase/data/data/default/pertest/7685e6c39d1394d94e26cf5ddafb7f9f/d/3ef195ca65044eca93cfa147414b56c2
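
Beyond -p (print key/values), the same tool can print an HFile's metadata and statistics; a short sketch against the same file path as above:

      # Print the file's meta block (compression, timerange, bloom filter, ...)
      hbase hfile -m -f /apps/hbase/data/data/default/pertest/7685e6c39d1394d94e26cf5ddafb7f9f/d/3ef195ca65044eca93cfa147414b56c2

      # Print key/row statistics for the file
      hbase hfile -s -f /apps/hbase/data/data/default/pertest/7685e6c39d1394d94e26cf5ddafb7f9f/d/3ef195ca65044eca93cfa147414b56c2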

Data Backup and Recovery:

    • Common backup and recovery approaches are surveyed in this reference document: http://blog.cloudera.com/blog/2013/11/approaches-to-backup-and-disaster-recovery-in-hbase/

Snapshots:

  • HBase snapshots are feature-rich, and the cluster does not need to be shut down to create one
  • A snapshot completes in seconds, has virtually no performance impact on the cluster, and occupies only a negligible amount of space
  • To enable snapshots, set hbase.snapshot.enabled to true in hbase-site.xml (a sample configuration fragment follows this list)
  • Command: snapshot 'pertest', 'snappertest' creates a snapshot named snappertest of the table pertest
  • Command: list_snapshots views the snapshot list
  • After a snapshot is created, a .hbase-snapshot directory is generated under the HBase root directory to store the snapshot information
  • Command: delete_snapshot 'snappertest' deletes the snapshot
  • Restoring a snapshot requires taking the table offline. Once the snapshot is restored, any data added or updated after the snapshot was taken is lost. The commands are:
      disable 'pertest'
      restore_snapshot 'snappertest'
      enable 'pertest'
  • Command: clone_snapshot 'snappertest', 'pertest1' creates a new table from the snapshot (note: the clone is a new table that does not copy the data); the full lifecycle is sketched below
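
A minimal sketch of the configuration and the full snapshot lifecycle, using the pertest/snappertest names from the commands above. First the hbase-site.xml fragment:

      <!-- hbase-site.xml: enable the snapshot feature -->
      <property>
        <name>hbase.snapshot.enabled</name>
        <value>true</value>
      </property>

Then the HBase shell session (restore_snapshot requires the table to be offline, hence the disable/enable pair):

      snapshot 'pertest', 'snappertest'         # take a snapshot
      list_snapshots                            # verify it exists
      disable 'pertest'
      restore_snapshot 'snappertest'            # roll the table back to the snapshot
      enable 'pertest'
      clone_snapshot 'snappertest', 'pertest1'  # new table backed by snapshot references
      delete_snapshot 'snappertest'             # remove the snapshot when done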

  • The ExportSnapshot utility exports snapshots. Command: hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot 'snappertest' -copy-to /apps/hbase/data/zhu
  • Note: if another cluster is reachable, the -copy-to address can be pointed directly at that cluster's HDFS directory
  • The exported file structure appears under the target directory
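
A sketch of both forms, assuming the second cluster's NameNode is reachable at the hypothetical address nn2.example.com (-mappers and -overwrite are optional tuning flags):

      # Export into a directory on the local HDFS
      hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
        -snapshot snappertest -copy-to /apps/hbase/data/zhu

      # Export straight into another cluster's HBase root (hypothetical address)
      hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
        -snapshot snappertest \
        -copy-to hdfs://nn2.example.com:8020/hbase \
        -mappers 4 -overwrite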

Replication:

    • HBase replication is another low-overhead backup tool. It is defined at the column-family level, works in the background, and keeps all edits synchronized between the clusters in the replication chain
    • Replication has three modes: master-slave (master->slave), master-master (master<->master), and cyclic. This gives you the flexibility to ingest data in any data center and to ensure it is propagated to all copies in the other data centers. If one data center suffers a catastrophic failure, client applications can use DNS tools to redirect to an alternate location
    • Note: for an existing table, you must first copy the source table to the destination table using one of the other methods described in this article; replication applies only to new writes/edits made after it is enabled
    • Replication is a powerful, fault-tolerant process. It provides "eventual consistency": the most recent edits to a table may not be visible in every replica at a given moment, but all replicas are guaranteed to converge eventually. A minimal master->slave setup is sketched below
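
A minimal sketch of enabling master->slave replication from the HBase shell, assuming a hypothetical slave ZooKeeper quorum at zk1.example.com and the column family d from the earlier examples (exact syntax varies slightly across HBase versions, and older releases also require hbase.replication=true in hbase-site.xml):

      # On the source cluster: register the slave cluster as peer '1'
      add_peer '1', 'zk1.example.com:2181:/hbase'

      # Mark column family 'd' of table 'pertest' for replication
      disable 'pertest'
      alter 'pertest', {NAME => 'd', REPLICATION_SCOPE => '1'}
      enable 'pertest'

      # Verify the peer is registered
      list_peers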

Export:

  • Export is a built-in HBase utility that makes it easy to dump the contents of an HBase table to HDFS as SequenceFiles
  • It runs a MapReduce job that reads each row of the specified table through a series of HBase API calls and writes the data to the specified HDFS directory
  • Example scenario: cluster A is a Windows HBase cluster created by HDInsight, and cluster B is a Linux HBase cluster running on Azure virtual machines. The Stocksinfo table in cluster A is to be exported to the HDFS directory of cluster B. Unfortunately the two clusters cannot communicate, so the data must first be exported locally and then uploaded manually
  • Command syntax: hbase org.apache.hadoop.hbase.mapreduce.Export <tablename> <outputdir>
  • The exported files (SequenceFiles) appear under the output directory
  • Command: hdfs dfs -get /zhu c:/zhu downloads the export to the C drive of a cluster A node; it is then uploaded manually to the Linux cluster
  • Import the data into the cluster B HBase table with the Import command (note: the files in the input directory must be in the format produced by the Export command)
  • Command syntax: hbase org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>
  • View the HBase table to confirm the data has arrived; the end-to-end sequence is sketched below
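
The end-to-end sequence for the scenario above, as a sketch (the /zhu and c:/zhu paths come from the example; the target table must already exist on cluster B with the same column families, since Import does not create it):

      # Cluster A: export table Stocksinfo to the HDFS directory /zhu
      hbase org.apache.hadoop.hbase.mapreduce.Export Stocksinfo /zhu

      # Cluster A: pull the export down to a local drive
      hdfs dfs -get /zhu c:/zhu

      # ... transfer c:/zhu to a cluster B node manually (e.g. scp) ...

      # Cluster B: push the files into HDFS and import them into the table
      hdfs dfs -put zhu /zhu
      hbase org.apache.hadoop.hbase.mapreduce.Import Stocksinfo /zhu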

CopyTable:

    • Like Export, CopyTable uses the HBase API to create a MapReduce job that reads data from the source table. The difference is that its output is another HBase table, which can be in the local cluster or in a remote cluster
    • It writes data to the destination table row by row with individual "puts". If the table is very large, the copy will fill up the MemStores on the target region servers, triggering flushes and eventually compactions, garbage collection, and so on
    • You must also consider the performance impact of running a MapReduce job against HBase; for large datasets this approach can be less than ideal
    • Command syntax: hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=pertest2 pertest (copies the table pertest to another table, pertest2, in the same cluster)
    • Note: when using --new.name=xxx, the new table must be created beforehand; a remote-cluster variant is sketched below
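
A sketch of both the local and the remote form, assuming a hypothetical remote ZooKeeper quorum zk-remote.example.com (--peer.adr is CopyTable's option for a remote destination cluster; the target table must exist in either case):

      # Copy pertest into pertest2 in the same cluster
      hbase org.apache.hadoop.hbase.mapreduce.CopyTable \
        --new.name=pertest2 pertest

      # Copy pertest to a table of the same name on a remote cluster
      hbase org.apache.hadoop.hbase.mapreduce.CopyTable \
        --peer.adr=zk-remote.example.com:2181:/hbase pertest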

Offline Backup of HDFS data:

    • See also: HBase (iii): Azure HDInsight HBase table data imported into a local HBase
