HBase backup and disaster recovery methods

Source: Internet
Author: User
Tags: failover, dns, tools

This article briefly introduces the data backup mechanisms available for Apache HBase and the disaster recovery mechanisms for the massive amounts of data it stores.

As HBase becomes widely used in important commercial systems, many enterprises need to establish a robust backup and disaster recovery (BDR) mechanism for their HBase clusters to protect their data assets. HBase and Apache Hadoop provide several built-in mechanisms to quickly and easily back up and restore petabytes of data.

In this article, you will get a high-level overview of the data backup mechanisms available in HBase and learn about several options for data recovery/disaster tolerance. After reading it, you should be able to make an informed judgment about which BDR strategy your business needs, and you should understand the advantages and disadvantages of each mechanism (applicable to CDH 4.3.0 / HBase 0.94.6 and later).

Backup

HBase is a distributed data store built on an LSM tree (log-structured merge-tree). It uses complex internal mechanisms to ensure data accuracy, consistency, and multi-versioning. So how do you obtain a consistent backup of data that lives in dozens of HFiles on HDFS and in WALs (write-ahead logs) held in Region Server memory?

Let's walk through the available mechanisms and how they work, starting with the least disruptive, the smallest data footprint, and the lowest performance impact:

  • Snapshots
  • Replication
  • Export
  • CopyTable
  • HTable API
  • Offline backup of HDFS data

The sections below describe each of these methods in detail and compare their trade-offs.


Snapshots

HBase snapshots are a rich feature, and creating a snapshot does not require taking the cluster offline. For more details about snapshots, see the Apache HBase snapshot introduction.

A snapshot captures the state of your HBase table at a point in time by creating the equivalent of UNIX hard links to the table's storage files in HDFS (Figure 1). A snapshot completes in a few seconds, has almost no performance impact on the cluster, and occupies a negligible amount of space: apart from a small amount of metadata stored in directory files, your data is not duplicated. Snapshots allow you to roll the system back to the moment the snapshot was created; to do so, of course, you need to restore the snapshot.


Figure 1

Create a table snapshot by running the following command in the HBase shell:

hbase(main):001:0>  snapshot 'myTable', 'MySnapShot'
After executing this command, you will find a few small data files in HDFS. The snapshot information is stored under /hbase/.snapshot/myTable (CDH 4) or /hbase/.hbase-snapshots (Apache 0.94.6.1). To restore the data, run the following commands in the shell:

hbase(main):002:0>  disable 'myTable'
hbase(main):003:0>  restore_snapshot 'MySnapShot'
hbase(main):004:0>  enable 'myTable'
As you can see, restoring a snapshot requires taking the table offline. Once a snapshot is restored, any data added or updated after the snapshot was taken is lost. If your business requires a remote backup of the data, you can use the ExportSnapshot command to copy a table's data to your local HDFS cluster or to a remote HDFS cluster of your choosing.
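
For example, a minimal sketch of exporting the snapshot created above to another cluster with the ExportSnapshot tool might look like this (the destination NameNode URI and the mapper count are placeholders for your own environment):

hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot MySnapShot -copy-to hdfs://backup-namenode:8020/hbase -mappers 16

The -mappers option controls how many map tasks perform the copy in parallel.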

A snapshot is a complete image of your table at a point in time; there is currently no incremental snapshot functionality.
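
If you need to manage the snapshots you have taken, the HBase shell also provides housekeeping commands; a minimal sketch (command names as shipped with HBase 0.94.6 and later):

hbase(main):005:0>  list_snapshots
hbase(main):006:0>  delete_snapshot 'MySnapShot'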

HBase replication

HBase replication is another backup tool with a relatively light load. The HBase Replication overview describes it in detail. In general, replication is defined at the column-family level, works in the background, and keeps all edits synchronized between the clusters in the replication chain.

There are three replication modes: master->slave, master<->master, and cyclic. This approach gives you the flexibility to ingest data in any data center while ensuring that the data is replicated to all of the other data centers. In the event of a catastrophic failure in one data center, client applications can be redirected to an alternate location using DNS tools.

Replication is a powerful, fault-tolerant process. It provides "eventual consistency": at any given moment, recent edits to a table may not yet have been applied to every replica of that table, but they are guaranteed to become consistent eventually.

Note: for an existing table, you first need to copy the source table to the destination table manually using one of the other methods described in this article; replication only applies to new writes/edits made after it has been enabled.
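
To illustrate, a minimal sketch of enabling replication for one column family to a slave cluster might look like the following (the peer ZooKeeper quorum and the column-family name are placeholders, and both clusters are assumed to already have hbase.replication set to true in hbase-site.xml):

hbase(main):001:0>  add_peer '1', 'backup-zk1,backup-zk2,backup-zk3:2181:/hbase'
hbase(main):002:0>  disable 'myTable'
hbase(main):003:0>  alter 'myTable', {NAME => 'cf1', REPLICATION_SCOPE => 1}
hbase(main):004:0>  enable 'myTable'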


Figure 2: Cluster replication architecture

Export

HBase's Export tool is a built-in utility that makes it easy to export data from an HBase table into SequenceFiles in an HDFS directory. It creates a MapReduce job that makes a series of HBase API calls to the cluster, reads each row of the specified table, and writes the data to the specified HDFS directory. This tool is more performance-intensive for the cluster because it uses MapReduce and the HBase client API, but it is feature-rich: it supports limiting by version or by date range and filtering data, which makes incremental backups possible.

The following is a simple example of an Export command:

hbase org.apache.hadoop.hbase.mapreduce.Export <tablename> <outputdir>
Once your table has been exported, you can copy the generated data files anywhere you like (such as off-site or off-cluster storage). You can also specify a directory on a remote HDFS cluster as the output-directory parameter of the command, so that the data is exported directly to the remote cluster. This method requires a network path, so you should make sure the network connection to the remote cluster is reliable and fast.
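
The Export command also takes optional trailing arguments for the number of versions and a start/end timestamp (in milliseconds), which is how incremental backups can be taken, and the companion Import tool loads the resulting SequenceFiles back into a table. A minimal sketch, with placeholder table name, path, and timestamps:

hbase org.apache.hadoop.hbase.mapreduce.Export myTable /backups/myTable-incr 1 1388534400000 1391212800000
hbase org.apache.hadoop.hbase.mapreduce.Import myTable /backups/myTable-incr

The first command exports one version of each cell written within the given time window; the second restores the files into myTable, which must already exist with the same schema.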

CopyTable

The CopyTable utility is described in detail in the article on using CopyTable to back up HBase online, but the basics are summarized here. Like Export, CopyTable creates a MapReduce job that uses the HBase API to read data from a source table. The key difference is that its output is another HBase table, which can be in the local cluster or in a remote cluster.

A simple example is as follows:

hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=testCopy test
This command copies the table named test to another table in the cluster named testCopy.

Note that there is a significant performance overhead here: CopyTable writes to the destination table row by row with individual "put" operations. If your table is very large, CopyTable can fill up the MemStores on the destination Region Servers, triggering flushes, compactions, garbage collection, and so on.

In addition, you must take into account the performance impact of running a MapReduce job over HBase. For large data sets, this approach may not be ideal.
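
As a sketch, CopyTable also accepts options for a time range and a remote destination cluster, which can limit how much data is moved in a single run (the timestamps and the peer ZooKeeper address below are placeholders):

hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1388534400000 --endtime=1391212800000 --peer.adr=backup-zk1,backup-zk2,backup-zk3:2181:/hbase --new.name=testCopy test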

HTable API (for example, a custom Java application)

As is always the case with Hadoop, you can write your own custom client application that uses the public API to query the table directly. You can do this through MapReduce jobs, to take advantage of that framework's batch-processing strengths, or through any other approach of your own design. However, this requires a deep understanding of Hadoop development and of the impact on your production cluster.

Offline backup of raw HDFS data

The most powerful backup mechanism is also the most disruptive one and involves the largest data footprint: you can cleanly shut down your HBase cluster and manually copy the data on HDFS. Because HBase is shut down, you can be sure all data has been persisted to HFiles on HDFS, so you will get the most accurate copy of the data. However, incremental backups are nearly impossible to obtain this way, since you cannot determine which data has changed.

It is also worth noting that restoring the data would require an offline repair of the metadata, because the .META. table could contain information that is invalid at the time of the restore. This method also requires a fast, reliable network to transfer the data off-site and to bring it back if a restore is ever needed.
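
A minimal sketch of this procedure, assuming the HBase start/stop scripts from the distribution's bin directory and DistCp for the raw copy (cluster names and paths are placeholders):

# Cleanly shut down HBase so that all data is flushed to HFiles
bin/stop-hbase.sh
# Copy the HBase root directory to the backup cluster
hadoop distcp hdfs://prod-namenode:8020/hbase hdfs://backup-namenode:8020/hbase-backup
# Restart HBase once the copy has finished
bin/start-hbase.sh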

For these reasons, Cloudera discourages this approach to backing up HBase.

Disaster recovery

HBase is designed as a distributed system that assumes failures will happen frequently and tolerates them. Disaster recovery in HBase usually takes one of the following forms:

  • A catastrophic failure at the data-center level, requiring failover to a backup location;
  • The need to restore a previous copy of data that was lost through user error or accidental deletion;
  • The ability to restore a point-in-time copy of data for auditing purposes.

As with any other disaster recovery plan, business requirements drive how much architecture you build and how much you invest. Once you have settled on the backup solution of your choice, restoring takes one of the following forms:

  • Failover to a backup cluster
  • Import table / restore snapshot
  • Point the HBase root directory at the backup location

If your backup strategy is such that you have replicated your HBase data to a backup cluster in a different data center, failover is simple: you only need to redirect your applications using DNS techniques.

Remember that if you plan to allow data to be written to your backup cluster during the outage, you must make sure that the data gets back to the primary cluster once the outage is over. Master<->master or cyclic replication architectures handle this process automatically, but for a master->slave setup you will need to intervene manually.

In the event of a failure, you can also simply modify the hbase.rootdir property in hbase-site.xml to change the HBase root directory, but this is the least desirable restore option because, as mentioned earlier, you may find that the .META. table is out of sync by the time you copy the data back to the production cluster.
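
For illustration, a sketch of what that change to hbase-site.xml would look like before restarting HBase (the path is a placeholder for the location of the backup copy):

<property>
  <name>hbase.rootdir</name>
  <!-- placeholder: point HBase at the backup copy of the data -->
  <value>hdfs://prod-namenode:8020/hbase-backup</value>
</property>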

Summary

To sum up, recovering from data loss or an outage requires a well-designed BDR plan. It is strongly recommended that you fully understand your business requirements for data accuracy/availability and for the maximum acceptable time to recovery. Armed with this knowledge, you can better choose the tools that meet those needs.

Choosing the tools is just the beginning. You should run large-scale tests of your BDR strategy to make sure it works within your infrastructure, and you should be thoroughly familiar with all the recovery steps.

(The translation is not perfect; please forgive any errors.)

Original article: http://blog.cloudera.com/blog/2013/11/approaches-to-backup-and-disaster-recovery-in-hbase/

Please indicate the source when reprinting: http://blog.csdn.net/iAm333
