HBase (v): HBase Operations Management


HBase ships with many tools that can be used for management, analysis, repair, and debugging. Some of their entry points are in the HBase shell client; the others live in HBase's jar packages.

Directory:

    • HBCK
    • HFile
    • Data Backup and Recovery
      1. Snapshots
      2. Replication
      3. Export
      4. CopyTable
      5. HTable API
      6. Offline Backup of HDFS data

HBCK:

  • The hbck tool detects and repairs inconsistencies in HBase's underlying state, including mismatches between the in-memory state of the Master and RegionServers and the data on HDFS, holes in the region chain, metadata inconsistencies, and more.
  • Command: hbase hbck -help shows help for all options
  • Command: hbase hbck -details shows a full report covering all regions
  • Command: hbase hbck -metaonly checks only the state of the hbase:meta table
  • Quick-fix commands (a combined check-then-repair session is sketched after this list):
  • Command: hbase hbck -repair -ignorePreCheckPermission
  • Command: hbase hbck -repairHoles -ignorePreCheckPermission
  • For an application example, see: HBase (iii): Azure HDInsight HBase table data imported into a local HBase
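
A minimal check-then-repair session built from the commands above (the -repair options rewrite cluster metadata, so always run the read-only checks first):

      # Read-only: report inconsistencies across all regions
      hbase hbck -details

      # Read-only: check only the hbase:meta table
      hbase hbck -metaonly

      # Low-risk fix first: plug holes in the region chain
      hbase hbck -repairHoles -ignorePreCheckPermission

      # Full repair as a last resort
      hbase hbck -repair -ignorePreCheckPermission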

HFile:

    • The hfile tool prints the contents of an HFile; command and parameters as follows (additional options are sketched below):
    • Command: hbase hfile -p -f /apps/hbase/data/data/default/pertest/7685e6c39d1394d94e26cf5ddafb7f9f/d/3ef195ca65044eca93cfa147414b56c2
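
Beyond -p (print key/values), the same tool can print an HFile's metadata and statistics; a short sketch against the same file path as above:

      # Print the file's meta block (compression, timerange, bloom filter, ...)
      hbase hfile -m -f /apps/hbase/data/data/default/pertest/7685e6c39d1394d94e26cf5ddafb7f9f/d/3ef195ca65044eca93cfa147414b56c2

      # Print key/row statistics for the file
      hbase hfile -s -f /apps/hbase/data/data/default/pertest/7685e6c39d1394d94e26cf5ddafb7f9f/d/3ef195ca65044eca93cfa147414b56c2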

Data Backup and Recovery:

    • Common backup and recovery approaches are surveyed in this reference document: http://blog.cloudera.com/blog/2013/11/approaches-to-backup-and-disaster-recovery-in-hbase/

Snapshots:

  • HBase snapshots are feature-rich, and the cluster does not need to be shut down to create one
  • A snapshot completes in seconds, has virtually no performance impact on the cluster, and occupies only a negligible amount of space
  • To enable snapshots, set hbase.snapshot.enabled to true in hbase-site.xml (a sample configuration fragment follows this list)
  • Command: snapshot 'pertest', 'snappertest' creates a snapshot named snappertest of the table pertest
  • Command: list_snapshots views the snapshot list
  • After a snapshot is created, a .hbase-snapshot directory is generated under the HBase root directory to store the snapshot information
  • Command: delete_snapshot 'snappertest' deletes the snapshot
  • Restoring a snapshot requires taking the table offline. Once the snapshot is restored, any data added or updated after the snapshot was taken is lost. The commands are:
      disable 'pertest'
      restore_snapshot 'snappertest'
      enable 'pertest'
  • Command: clone_snapshot 'snappertest', 'pertest1' creates a new table from the snapshot (note: the clone is a new table that does not copy the data); the full lifecycle is sketched below
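
A minimal sketch of the configuration and the full snapshot lifecycle, using the pertest/snappertest names from the commands above. First the hbase-site.xml fragment:

      <!-- hbase-site.xml: enable the snapshot feature -->
      <property>
        <name>hbase.snapshot.enabled</name>
        <value>true</value>
      </property>

Then the HBase shell session (restore_snapshot requires the table to be offline, hence the disable/enable pair):

      snapshot 'pertest', 'snappertest'         # take a snapshot
      list_snapshots                            # verify it exists
      disable 'pertest'
      restore_snapshot 'snappertest'            # roll the table back to the snapshot
      enable 'pertest'
      clone_snapshot 'snappertest', 'pertest1'  # new table backed by snapshot references
      delete_snapshot 'snappertest'             # remove the snapshot when done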

  • The ExportSnapshot utility exports snapshots. Command: hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot 'snappertest' -copy-to /apps/hbase/data/zhu
  • Note: if another cluster is reachable, the -copy-to address can be pointed directly at that cluster's HDFS directory
  • The exported file structure appears under the target directory
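
A sketch of both forms, assuming the second cluster's NameNode is reachable at the hypothetical address nn2.example.com (-mappers and -overwrite are optional tuning flags):

      # Export into a directory on the local HDFS
      hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
        -snapshot snappertest -copy-to /apps/hbase/data/zhu

      # Export straight into another cluster's HBase root (hypothetical address)
      hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
        -snapshot snappertest \
        -copy-to hdfs://nn2.example.com:8020/hbase \
        -mappers 4 -overwrite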

Replication:

    • HBase replication is another low-overhead backup tool. It is defined at the column-family level, works in the background, and keeps all edits synchronized between the clusters in the replication chain
    • Replication has three modes: master-slave (master->slave), master-master (master<->master), and cyclic. This gives you the flexibility to ingest data in any data center and to ensure it is propagated to all copies in the other data centers. If one data center suffers a catastrophic failure, client applications can use DNS tools to redirect to an alternate location
    • Note: for an existing table, you must first copy the source table to the destination table using one of the other methods described in this article; replication applies only to new writes/edits made after it is enabled
    • Replication is a powerful, fault-tolerant process. It provides "eventual consistency": the most recent edits to a table may not be visible in every replica at a given moment, but all replicas are guaranteed to converge eventually. A minimal master->slave setup is sketched below
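
A minimal sketch of enabling master->slave replication from the HBase shell, assuming a hypothetical slave ZooKeeper quorum at zk1.example.com and the column family d from the earlier examples (exact syntax varies slightly across HBase versions, and older releases also require hbase.replication=true in hbase-site.xml):

      # On the source cluster: register the slave cluster as peer '1'
      add_peer '1', 'zk1.example.com:2181:/hbase'

      # Mark column family 'd' of table 'pertest' for replication
      disable 'pertest'
      alter 'pertest', {NAME => 'd', REPLICATION_SCOPE => '1'}
      enable 'pertest'

      # Verify the peer is registered
      list_peers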

Export:

  • Export is a built-in HBase utility that makes it easy to dump the contents of an HBase table to HDFS as SequenceFiles
  • It runs a MapReduce job that reads each row of the specified table through a series of HBase API calls and writes the data to the specified HDFS directory
  • Example scenario: cluster A is a Windows HBase cluster created by HDInsight, and cluster B is a Linux HBase cluster running on Azure virtual machines. The Stocksinfo table in cluster A is to be exported to the HDFS directory of cluster B. Unfortunately the two clusters cannot communicate, so the data must first be exported locally and then uploaded manually
  • Command syntax: hbase org.apache.hadoop.hbase.mapreduce.Export <tablename> <outputdir>
  • The exported files (SequenceFiles) appear under the output directory
  • Command: hdfs dfs -get /zhu c:/zhu downloads the export to the C drive of a cluster A node; it is then uploaded manually to the Linux cluster
  • Import the data into the cluster B HBase table with the Import command (note: the files in the input directory must be in the format produced by the Export command)
  • Command syntax: hbase org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>
  • View the HBase table to confirm the data has arrived; the end-to-end sequence is sketched below
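
The end-to-end sequence for the scenario above, as a sketch (the /zhu and c:/zhu paths come from the example; the target table must already exist on cluster B with the same column families, since Import does not create it):

      # Cluster A: export table Stocksinfo to the HDFS directory /zhu
      hbase org.apache.hadoop.hbase.mapreduce.Export Stocksinfo /zhu

      # Cluster A: pull the export down to a local drive
      hdfs dfs -get /zhu c:/zhu

      # ... transfer c:/zhu to a cluster B node manually (e.g. scp) ...

      # Cluster B: push the files into HDFS and import them into the table
      hdfs dfs -put zhu /zhu
      hbase org.apache.hadoop.hbase.mapreduce.Import Stocksinfo /zhu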

CopyTable:

    • Like Export, CopyTable uses the HBase API to create a MapReduce job that reads data from the source table. The difference is that its output is another HBase table, which can be in the local cluster or in a remote cluster
    • It writes data to the destination table row by row with individual "puts". If the table is very large, the copy will fill up the MemStores on the target region servers, triggering flushes and eventually compactions, garbage collection, and so on
    • You must also consider the performance impact of running a MapReduce job against HBase; for large datasets this approach can be less than ideal
    • Command syntax: hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=pertest2 pertest (copies the table pertest to another table, pertest2, in the same cluster)
    • Note: when using --new.name=xxx, the new table must be created beforehand; a remote-cluster variant is sketched below
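
A sketch of both the local and the remote form, assuming a hypothetical remote ZooKeeper quorum zk-remote.example.com (--peer.adr is CopyTable's option for a remote destination cluster; the target table must exist in either case):

      # Copy pertest into pertest2 in the same cluster
      hbase org.apache.hadoop.hbase.mapreduce.CopyTable \
        --new.name=pertest2 pertest

      # Copy pertest to a table of the same name on a remote cluster
      hbase org.apache.hadoop.hbase.mapreduce.CopyTable \
        --peer.adr=zk-remote.example.com:2181:/hbase pertest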

Offline Backup of HDFS data:

    • See also: HBase (iii): Azure HDInsight HBase table data imported into a local HBase
