HBase Snapshot Technology

Source: Internet
Author: User

What is a snapshot

A snapshot is a set of metadata that lets an administrator return a table to an earlier state. It is not a copy of the table but a list of file names, so no data is copied when the snapshot is taken.
A full snapshot restore returns the table to its earlier schema and to the data it held at that time; changes made after the snapshot was taken are not kept.
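
The idea that a snapshot is only a list of file names can be illustrated with a small toy model. This is a sketch under stated assumptions, not HBase code: the dictionaries, the file names such as hfile-001, and the functions take_snapshot and restore are all hypothetical and exist only for illustration.

```python
# Toy model only: file names stand in for immutable data files in HDFS.
# None of these names are HBase APIs.

def take_snapshot(table):
    """Record the table's schema and the names of its current files."""
    return {"schema": dict(table["schema"]), "files": list(table["files"])}

def restore(table, snap):
    """Point the table back at the schema and files the snapshot recorded."""
    table["schema"] = dict(snap["schema"])
    table["files"] = list(snap["files"])

table = {"schema": {"cf": "v1"}, "files": ["hfile-001", "hfile-002"]}
snap = take_snapshot(table)  # copies names only, never file contents

# Later writes add a new file and the schema changes.
table["files"].append("hfile-003")
table["schema"]["cf2"] = "v1"

restore(table, snap)
print(table["files"])   # ['hfile-001', 'hfile-002']
print(table["schema"])  # {'cf': 'v1'}
```

After the restore, the file written after the snapshot is no longer part of the table, which matches the description above.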

The role of snapshots

There are two traditional ways to back up or clone a table in HBase: use the CopyTable or Export tools, or disable the table and copy all of its HFiles in HDFS.
CopyTable and Export are tools that run MapReduce jobs to scan and copy the table, which puts load directly on the region servers. Disabling a table stops all reads and writes, which is usually unacceptable in production.
In contrast, HBase snapshots let an administrator clone a table without copying data, with minimal impact on the region servers. Exporting a snapshot to another cluster does not directly affect any region server; the export is essentially a file copy between clusters with some additional logic.

Snapshot Benefits

Besides offering better consistency than CopyTable or Export, the main difference is that exporting a snapshot is done at the HDFS level. This means the HMaster and the region servers are not involved in the operation. As a result, no cache space is filled with unneeded data, no scan takes place, and there are no GC pauses caused by creating a large number of objects. The main performance impact on HBase is the additional network and disk load on the DataNodes.

Application Scenarios

1. Recover from user or application errors.
2. Restore to a known safe state.
3. View previous snapshots and selectively merge the differences into the production environment.
4. Save a snapshot before a major application upgrade or change.
5. Audit and/or report on data as of a specific time.
6. Capture monthly data for compliance purposes.
7. Generate end-of-day, end-of-month, or end-of-quarter reports.
8. Test applications.
9. Test schema or application changes against a snapshot that mirrors production, then discard it when testing is complete.
For example: take a snapshot, create a new table from its contents (original schema plus data), and then change the new table's schema, add or remove columns, and so on. (The original table, the snapshot, and the new table remain independent of one another.)
10. Offload work from the main cluster.
11. Take a snapshot, export it to another cluster, and run MapReduce jobs there. Because the export works at the HDFS level, it does not slow down the main HBase cluster the way CopyTable does.

Snapshot Operations

To generate a snapshot:
This operation attempts to take a snapshot of the specified table. If the cluster is balancing, splitting, or merging regions at the time, the operation may fail.

To clone a snapshot:
This operation creates a new table with the same schema and data as the specified snapshot. The result is a fully functional table, and modifying it does not affect the original table or the snapshot.

To restore a snapshot:
This operation returns the table's schema and data to their state at the time the snapshot was taken. (Note: this discards any changes made after the snapshot was taken.)

To delete a snapshot:
This operation removes the snapshot from the system and frees any disk space that is not shared. It does not affect clones or other snapshots.

To export a snapshot:
This operation copies the snapshot's data and metadata to another cluster. It involves only HDFS and does not communicate with the HMaster or the region servers, so the HBase cluster can even be shut down while it runs.
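
The clone and delete behaviour described above (a clone copies no data, and deleting a snapshot frees only unshared space) can be sketched with a second toy model. This is an illustration under assumptions, not how HBase is implemented: the ToyStore class, its methods, and the table and file names are all hypothetical.

```python
class ToyStore:
    """Toy model of files shared by tables and snapshots (not HBase code)."""

    def __init__(self):
        self.tables = {}     # table name -> set of file names
        self.snapshots = {}  # snapshot name -> frozenset of file names

    def snapshot(self, table, name):
        self.snapshots[name] = frozenset(self.tables[table])

    def clone(self, name, new_table):
        # The clone starts out referencing the same files; nothing is copied.
        self.tables[new_table] = set(self.snapshots[name])

    def delete_snapshot(self, name):
        del self.snapshots[name]

    def files_on_disk(self):
        # A file is kept while any table or snapshot still references it.
        live = set()
        for files in list(self.tables.values()) + list(self.snapshots.values()):
            live |= files
        return live


store = ToyStore()
store.tables["orders"] = {"f1", "f2"}
store.snapshot("orders", "snap1")
store.clone("snap1", "orders_test")  # shares f1 and f2 with snap1

store.tables["orders"] = {"f3"}      # table rewritten into a new file
store.snapshot("orders", "snap2")
store.tables["orders"] = {"f4"}      # rewritten again; only snap2 holds f3

store.delete_snapshot("snap2")       # f3 was unshared, so it is freed
store.delete_snapshot("snap1")       # f1, f2 survive: the clone uses them
print(sorted(store.files_on_disk()))         # ['f1', 'f2', 'f4']
print(sorted(store.tables["orders_test"]))   # ['f1', 'f2']
```

Tracking liveness by reference rather than by copy is what lets the clone remain intact after both snapshots are deleted in this model.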

Demo

Verify that snapshot support is turned on by checking whether hbase.snapshot.enabled is set to true in hbase-site.xml.
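
This check can also be scripted. The following is a minimal sketch that reads the property from the text of an hbase-site.xml file; the function name snapshot_enabled and the sample configuration are assumptions made for illustration. When the property is absent the function returns None, in which case the default for your HBase version applies.

```python
import xml.etree.ElementTree as ET

def snapshot_enabled(site_xml_text):
    """Return True/False for hbase.snapshot.enabled, or None if it is not set."""
    root = ET.fromstring(site_xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "hbase.snapshot.enabled":
            return prop.findtext("value", "").strip().lower() == "true"
    return None  # property absent: the version's default applies

# Hypothetical sample content; in practice read the real hbase-site.xml.
sample = """<configuration>
  <property>
    <name>hbase.snapshot.enabled</name>
    <value>true</value>
  </property>
</configuration>"""

print(snapshot_enabled(sample))  # True
```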

1. Take a snapshot of the specified table with the snapshot command (no files are copied).
hbase>snapshot 'tableName','snapshotName'

2. List all snapshots with the list_snapshots command. It displays each snapshot's name, its source table, and the date and time it was created.
hbase>list_snapshots


3. Delete a snapshot with the delete_snapshot command. Deleting a snapshot does not affect cloned tables or other snapshots.
hbase>delete_snapshot 'snapshotName'

4. Use the clone_snapshot command to create a new table (a clone) from the specified snapshot. Because no data is copied, the clone does not double the storage used.
hbase>clone_snapshot 'snapshotName','newTableName'

5. Use the restore_snapshot command to replace the current table's schema and data with the contents of the specified snapshot (the table must be disabled before it is restored).
hbase>restore_snapshot 'snapshotName'


6. Use the ExportSnapshot tool to export an existing snapshot to another cluster. The export does not add load to the region servers; it works only at the HDFS level, so you need to specify an HDFS path (the HBase root directory of the other cluster).
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot snapshotName -copy-to hdfs://srv2:8082/hbase
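
When the export is driven from a script, it can help to build the command as an argument list rather than a single string. The sketch below assumes the -snapshot, -copy-to, and optional -mappers options of ExportSnapshot; the helper name export_snapshot_cmd is hypothetical, and the snapshot name and destination are the placeholders used in this article.

```python
import shlex

def export_snapshot_cmd(snapshot, dest_root, mappers=None):
    """Build the ExportSnapshot command line as a list of arguments."""
    cmd = [
        "hbase", "org.apache.hadoop.hbase.snapshot.ExportSnapshot",
        "-snapshot", snapshot,
        "-copy-to", dest_root,
    ]
    if mappers is not None:
        # Optional: number of map tasks used for the copy.
        cmd += ["-mappers", str(mappers)]
    return cmd

cmd = export_snapshot_cmd("snapshotName", "hdfs://srv2:8082/hbase", mappers=16)
# Print a shell-safe rendering; nothing is executed here.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The list can be handed to subprocess.run(cmd, check=True) on a machine where the hbase launcher is on the PATH.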

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
