Using snapshots to back up and restore HDFS files: a hands-on walkthrough

Source: Internet
Author: User
Tags hdfs dfs

This walkthrough shows how to back up files on HDFS using snapshots.

For the API reference, see http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.5.0-cdh5.2.0/hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html

==========================================================================================

1. Allow snapshot creation

First, run the command below on the folder you want to back up, so that snapshots can be created on it:

hdfs dfsadmin -allowSnapshot <path>

Example: hdfs dfsadmin -allowSnapshot /workspace/linlin

When the command reports success, snapshots are allowed on that folder.
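
As a minimal sketch of this step, the helper below wraps the command and then lists the snapshottable directories to confirm the change. The function name allow_snapshots is hypothetical and not part of Hadoop; hdfs lsSnapshottableDir is the listing command covered in step 3.

```shell
# allow_snapshots DIR: hypothetical helper, not part of Hadoop.
# Marks DIR as snapshottable, then lists the snapshottable directories.
allow_snapshots() {
    if [ -z "${1:-}" ]; then
        echo "usage: allow_snapshots <path>" >&2
        return 2
    fi
    # Stop here if HDFS refuses the request.
    hdfs dfsadmin -allowSnapshot "$1" || return 1
    hdfs lsSnapshottableDir
}
```

Usage: allow_snapshots /workspace/linlin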

=================================================================================================

2. Create a Snapshot

The next step is to take a backup of this folder:

hdfs dfs -createSnapshot <path> [name]

Example: hdfs dfs -createSnapshot /workspace/linlin bak1

When the command reports success, the snapshot has been created.

Next, consider whether a snapshot can also be created on a subdirectory of linlin.

In the HDFS directory structure, /workspace/linlin contains a subdirectory named snaptest.

Try to create a snapshot on snaptest:

hdfs dfs -createSnapshot /workspace/linlin/snaptest bak2

This fails with an error: a snapshot can only be created on a directory for which snapshot creation has been allowed.

The first snapshot, bak1, was taken before the snaptest folder existed. Now that snaptest has been added, take another snapshot.

If you reuse the same name:

hdfs dfs -createSnapshot /workspace/linlin bak1

the command fails with an error saying that the snapshot name already exists.

Running hdfs dfs -createSnapshot /workspace/linlin bak2 instead succeeds.
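
Because reusing a snapshot name fails, one option is to generate a fresh name for every snapshot. The sketch below assumes a hypothetical naming scheme (a prefix such as bak followed by a timestamp); both function names are made up for illustration.

```shell
# snapshot_name [PREFIX]: prints PREFIX-YYYYmmdd-HHMMSS.
# Hypothetical scheme; two calls within the same second give the same name.
snapshot_name() {
    printf '%s-%s\n' "${1:-bak}" "$(date +%Y%m%d-%H%M%S)"
}

# create_unique_snapshot DIR [PREFIX]: snapshot DIR under a fresh name.
create_unique_snapshot() {
    hdfs dfs -createSnapshot "$1" "$(snapshot_name "${2:-bak}")"
}
```

Usage: create_unique_snapshot /workspace/linlin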

==============================================================================================================

3. View Snapshots

List all snapshottable directories:

hdfs lsSnapshottableDir

This shows every directory on which snapshot creation has been allowed.

To view the files in a snapshot, note that Hadoop stores snapshots under a .snapshot folder inside the snapshottable directory; you must include .snapshot in the path to see what has been backed up.
.snapshot was introduced in later Hadoop versions and is a reserved name, so an ordinary file or folder cannot be named .snapshot.

Run:

hdfs dfs -ls /workspace/linlin/.snapshot/

The listing shows three snapshots under this directory: bak1, bak2, and linlin.
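
The path convention above can be sketched as a small helper that builds the .snapshot path for a directory and, optionally, for one named snapshot. The function names snapshot_path and list_snapshot are hypothetical.

```shell
# snapshot_path DIR [NAME]: prints DIR/.snapshot or DIR/.snapshot/NAME.
snapshot_path() {
    base="${1%/}"    # drop one trailing slash, if any
    if [ -n "${2:-}" ]; then
        printf '%s/.snapshot/%s\n' "$base" "$2"
    else
        printf '%s/.snapshot\n' "$base"
    fi
}

# list_snapshot DIR [NAME]: list all snapshots of DIR, or the contents of one.
list_snapshot() {
    hdfs dfs -ls "$(snapshot_path "$1" "${2:-}")"
}
```

Usage: list_snapshot /workspace/linlin lists the snapshots of the directory, and list_snapshot /workspace/linlin bak2 lists the files captured in bak2.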

===========================================================================================================

4. Compare Snapshots

Comparing two snapshots shows how the backed-up files differ between them.

Run:

hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot>

For example, run hdfs snapshotDiff on /workspace/linlin with two of its snapshots.

Results:
+ The file/directory has been created.
- The file/directory has been deleted.
M The file/directory has been modified.
R The file/directory has been renamed.
The M here means that the linlin folder has been modified, and the + marks the newly created snaptest folder.
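
To pick out one kind of change from a diff report, a small filter can be enough. The sketch below assumes each report line consists of a marker, whitespace, and then a path, as the legend above suggests; that layout is an assumption, and paths containing spaces would be cut at the first space. The function name diff_entries is hypothetical.

```shell
# diff_entries MARKER: reads snapshotDiff-style lines on stdin and prints
# the path of every line whose first field equals MARKER (+, -, M or R).
# Assumed line layout: "<marker><whitespace><path>".
diff_entries() {
    awk -v marker="$1" '$1 == marker { print $2 }'
}
```

Usage: hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot> | diff_entries +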
============================================================================================================================
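5. Restore from a Snapshot

To restore files from a snapshot, copy them out of the .snapshot folder:

hdfs dfs -cp <snapshot path> <destination path>

Example: hdfs dfs -cp /workspace/linlin/.snapshot/bak2/snaptest /workspace

Listing the HDFS directory afterwards shows that snaptest has been successfully restored to /workspace.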
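
The restore step can be wrapped so that the snapshot copy is checked before anything is copied. This is a sketch under stated assumptions: the function name restore_from_snapshot and its argument order are hypothetical, and the existence check uses hdfs dfs -test -e.

```shell
# restore_from_snapshot DIR SNAPSHOT RELPATH DEST: hypothetical helper.
# Copies DIR/.snapshot/SNAPSHOT/RELPATH into DEST.
restore_from_snapshot() {
    if [ "$#" -ne 4 ]; then
        echo "usage: restore_from_snapshot <dir> <snapshot> <relpath> <dest>" >&2
        return 2
    fi
    src="${1%/}/.snapshot/$2/$3"
    # Make sure the snapshot copy exists before copying.
    hdfs dfs -test -e "$src" || { echo "not found: $src" >&2; return 1; }
    hdfs dfs -cp "$src" "$4"
}
```

Usage: restore_from_snapshot /workspace/linlin bak2 snaptest /workspace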
==================================================================================================================================
Off topic:
If we try to delete the folder on which the snapshots were created, the delete is refused with an error.

This shows that a folder cannot be deleted or moved while it still has snapshots.
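
If the folder really does need to go, its snapshots have to be removed first. The sketch below is not from the original article: it assumes the standard hdfs dfs -deleteSnapshot and hdfs dfsadmin -disallowSnapshot commands, and the function name drop_snapshots is hypothetical.

```shell
# drop_snapshots DIR NAME...: hypothetical helper, not from the article.
# Deletes each named snapshot of DIR, then disallows snapshots on DIR.
drop_snapshots() {
    if [ "$#" -lt 2 ]; then
        echo "usage: drop_snapshots <dir> <snapshot>..." >&2
        return 2
    fi
    dir="$1"
    shift
    for name in "$@"; do
        hdfs dfs -deleteSnapshot "$dir" "$name" || return 1
    done
    hdfs dfsadmin -disallowSnapshot "$dir"
}
```

Usage: drop_snapshots /workspace/linlin bak1 bak2 linlin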

Original: HTTP://WWW.NOSQLCN.COM/SHOWARTICLE/23
