Original link: http://blog.csdn.net/ashic/article/details/47068183
Official Document Link: http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html
Overview
An HDFS snapshot is a read-only, point-in-time copy of the file system. A snapshot can be taken of a subdirectory of the file system or of the entire file system. Snapshots are commonly used for data backup, protection against user errors, and disaster recovery.
Creating HDFS snapshots is efficient:
Snapshot creation is instantaneous: the cost is O(1), excluding the time needed to look up the inode.
Additional memory is used only when changes are made relative to a snapshot: memory usage is O(M), where M is the number of modified files or directories.
Blocks on the DataNodes are not copied: the snapshot only records the block list and the file size.
Snapshots do not adversely affect regular HDFS operations: modifications are recorded in reverse chronological order, so the current data can be read directly. The snapshot data is computed by subtracting the modifications from the current data.
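The bookkeeping described above can be illustrated with a toy model. This is only a sketch of the idea, not HDFS code: the class SnapshotDir and all of its methods are hypothetical names, and a directory is reduced to a dictionary of file names and contents.

```python
_ABSENT = object()  # marker: the file did not exist when the snapshot was taken


class SnapshotDir:
    """Toy model: live data plus one reverse diff per snapshot."""

    def __init__(self):
        self.current = {}    # file name -> content (the live data)
        self.snapshots = []  # list of (snapshot name, {file name: old content})

    def create_snapshot(self, name):
        # O(1): only an empty diff record is appended, no data is copied.
        self.snapshots.append((name, {}))

    def _record(self, fname):
        # Remember the pre-modification value in the newest snapshot's diff,
        # but only the first time the file changes after that snapshot.
        if self.snapshots:
            diff = self.snapshots[-1][1]
            if fname not in diff:
                diff[fname] = self.current.get(fname, _ABSENT)

    def write(self, fname, content):
        self._record(fname)
        self.current[fname] = content

    def delete(self, fname):
        self._record(fname)
        self.current.pop(fname, None)

    def read_snapshot(self, name):
        # Start from the live data and undo the diffs, newest snapshot first,
        # until the requested snapshot has been reached.
        view = dict(self.current)
        for snap_name, diff in reversed(self.snapshots):
            for fname, old in diff.items():
                if old is _ABSENT:
                    view.pop(fname, None)
                else:
                    view[fname] = old
            if snap_name == name:
                return view
        raise KeyError(name)


d = SnapshotDir()
d.write("a", 1)
d.create_snapshot("s0")
d.write("a", 2)
d.write("b", 3)
d.create_snapshot("s1")
d.delete("a")
print(d.current, d.read_snapshot("s0"), d.read_snapshot("s1"))
```

In this model, reads of the live data never touch the diffs, and each diff grows only with the number of files changed after the snapshot, which mirrors the O(1) creation and O(M) memory claims.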
Snapshottable Directories
Snapshots can be created only on directories that have been set as snapshottable. A snapshottable directory can hold up to 65,536 simultaneous snapshots. The administrator can set any directory to be snapshottable. If a snapshottable directory contains snapshots, it cannot be deleted or renamed until all of its snapshots have been deleted.
Nested snapshottable directories are not allowed: a directory cannot be set as snapshottable if one of its ancestor or descendant directories is already snapshottable.
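The nesting rule can be expressed as a small path check. The sketch below is illustrative only: can_allow_snapshot and _is_ancestor are hypothetical helpers, not part of any Hadoop API, and treating a directory that is already snapshottable as acceptable is an assumption of this sketch.

```python
import posixpath


def _is_ancestor(a, b):
    """Return True if directory a is a strict ancestor of path b."""
    prefix = a.rstrip("/") + "/"
    return a != b and b.startswith(prefix)


def can_allow_snapshot(path, snapshottable):
    """Check the no-nesting rule against the existing snapshottable directories."""
    path = posixpath.normpath(path)
    for existing in snapshottable:
        existing = posixpath.normpath(existing)
        if existing == path:
            continue  # assumption: re-allowing the same directory is harmless
        if _is_ancestor(existing, path) or _is_ancestor(path, existing):
            return False
    return True


print(can_allow_snapshot("/foo/bar", ["/foo"]))  # descendant of /foo
print(can_allow_snapshot("/foobar", ["/foo"]))   # sibling with a similar name
```

Comparing against the prefix with a trailing slash keeps /foobar from being mistaken for a child of /foo.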
Snapshot Paths
When a directory is set as snapshottable and a snapshot is created, a ".snapshot" directory appears under that directory to hold the snapshots. Suppose /foo is a snapshottable directory, bar is a file or directory in /foo, and s0 is a snapshot of /foo. The snapshot copy of bar is then stored at /foo/.snapshot/s0/bar.
The usual APIs and CLI commands work with ".snapshot" paths. Here are some examples:
List all snapshots of a snapshottable directory:
    hdfs dfs -ls /foo/.snapshot
List all files in snapshot s0:
    hdfs dfs -ls /foo/.snapshot/s0
Copy a file from snapshot s0:
    hdfs dfs -cp /foo/.snapshot/s0/bar /tmp
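The mapping from a live path to its snapshot path is purely mechanical, as the following sketch shows. The function snapshot_path is a hypothetical helper written for illustration; it only builds strings and does not talk to HDFS.

```python
import posixpath


def snapshot_path(snapshottable_dir, snapshot_name, live_path):
    """Build the path under .snapshot that corresponds to live_path."""
    rel = posixpath.relpath(live_path, snapshottable_dir)
    if rel == ".." or rel.startswith("../"):
        raise ValueError(f"{live_path} is not under {snapshottable_dir}")
    base = posixpath.join(snapshottable_dir, ".snapshot", snapshot_name)
    # The snapshottable directory itself maps to the snapshot root.
    return base if rel == "." else posixpath.join(base, rel)


print(snapshot_path("/foo", "s0", "/foo/bar"))
```

The resulting string can be passed to commands such as hdfs dfs -ls or hdfs dfs -cp like any other path.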
Snapshot Operations
Allowing and disallowing snapshots requires superuser privileges.
Allow Snapshots
Allows snapshots to be created on a directory. If the operation succeeds, the directory becomes snapshottable.
    hdfs dfsadmin -allowSnapshot <path>
    # hdfs dfsadmin -allowSnapshot /snap
    Allowing snaphot on /snap succeeded

Disallow Snapshots
Disallows snapshots on a directory. All snapshots of the directory must be deleted before snapshots can be disallowed.
    hdfs dfsadmin -disallowSnapshot <path>
    # hdfs dfsadmin -disallowSnapshot /snap
    Disallowing snaphot on /snap succeeded

Create Snapshots
Creates a snapshot of a snapshottable directory. This operation requires owner permission on the directory.
    hdfs dfs -createSnapshot <path> [<snapshotName>]
If snapshotName is not specified, a default name is generated from the timestamp in the format "'s'yyyyMMdd-HHmmss.SSS", for example "s20130412-151029.033".
    # hdfs dfs -createSnapshot /snap
    Created snapshot /snap/.snapshot/s20150726-120414.379
    # hdfs dfs -createSnapshot /snap s0
    Created snapshot /snap/.snapshot/s0
    # hdfs dfs -ls -R /snap/.snapshot
    drwxr-xr-x   - root supergroup          0 2015-07-26 12:04 /snap/.snapshot/s0
    -rw-r--r--   1 root supergroup        831 2015-07-26 11:56 /snap/.snapshot/s0/hehe.ora
    -rw-r--r--   1 root supergroup         72 2015-07-26 11:55 /snap/.snapshot/s0/sum.sh
    -rw-r--r--   1 root supergroup        754 2015-07-26 11:56 /snap/.snapshot/s0/test.sh
    drwxr-xr-x   - root supergroup          0 2015-07-26 12:04 /snap/.snapshot/s20150726-120414.379
    -rw-r--r--   1 root supergroup        831 2015-07-26 11:56 /snap/.snapshot/s20150726-120414.379/hehe.ora
    -rw-r--r--   1 root supergroup         72 2015-07-26 11:55 /snap/.snapshot/s20150726-120414.379/sum.sh
    -rw-r--r--   1 root supergroup        754 2015-07-26 11:56 /snap/.snapshot/s20150726-120414.379/test.sh

Delete Snapshots
    hdfs dfs -deleteSnapshot <path> <snapshotName>
    # hdfs dfs -deleteSnapshot /snap s0
    # hdfs dfs -ls -R /snap/.snapshot
    drwxr-xr-x   - root supergroup          0 2015-07-26 12:04 /snap/.snapshot/s20150726-120414.379
    -rw-r--r--   1 root supergroup        831 2015-07-26 11:56 /snap/.snapshot/s20150726-120414.379/hehe.ora
    -rw-r--r--   1 root supergroup         72 2015-07-26 11:55 /snap/.snapshot/s20150726-120414.379/sum.sh
    -rw-r--r--   1 root supergroup        754 2015-07-26 11:56 /snap/.snapshot/s20150726-120414.379/test.sh

Rename Snapshots
    hdfs dfs -renameSnapshot <path> <oldName> <newName>
    # hdfs dfs -createSnapshot /snap s0
    Created snapshot /snap/.snapshot/s0
    # hdfs dfs -renameSnapshot /snap s0 s1
    # hadoop fs -ls -R /snap/.snapshot
    drwxr-xr-x   - root supergroup          0 2015-07-26 12:10 /snap/.snapshot/s1
    -rw-r--r--   1 root supergroup        831 2015-07-26 11:56 /snap/.snapshot/s1/hehe.ora
    -rw-r--r--   1 root supergroup         72 2015-07-26 11:55 /snap/.snapshot/s1/sum.sh
    -rw-r--r--   1 root supergroup        754 2015-07-26 11:56 /snap/.snapshot/s1/test.sh
    drwxr-xr-x   - root supergroup          0 2015-07-26 12:04 /snap/.snapshot/s20150726-120414.379
    -rw-r--r--   1 root supergroup        831 2015-07-26 11:56 /snap/.snapshot/s20150726-120414.379/hehe.ora
    -rw-r--r--   1 root supergroup         72 2015-07-26 11:55 /snap/.snapshot/s20150726-120414.379/sum.sh
    -rw-r--r--   1 root supergroup        754 2015-07-26 11:56 /snap/.snapshot/s20150726-120414.379/test.sh

Get Snapshottable Directory Listing
Lists all snapshottable directories on which the current user has permission to create snapshots.
    hdfs lsSnapshottableDir
    # hdfs lsSnapshottableDir
    drwxr-xr-x 0 root supergroup 0 2015-07-26 12:10 2 65536 /snap

Get Snapshots Difference Report
Gets the differences between two snapshots. This operation requires read permission on all files and directories in both snapshots.
    hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot>
Results:
    +   The file/directory has been created.
    -   The file/directory has been deleted.
    M   The file/directory has been modified.
    R   The file/directory has been renamed.
Before running this experiment, we first delete the file /snap/sum.sh and then create a snapshot s2 of /snap.
    # hadoop fs -rm /snap/sum.sh
    15/07/26 12:16:20 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 1440 minutes, Emptier interval = 0 minutes.
    Moved: 'hdfs://localhost:9000/snap/sum.sh' to trash at: hdfs://localhost:9000/user/root/.Trash/Current
The two lines above show that the file was not deleted outright but moved to the trash, where it is retained for 1440 minutes.
    # hdfs dfs -createSnapshot /snap s2
    Created snapshot /snap/.snapshot/s2
    # hdfs snapshotDiff /snap s1 s2
    Difference between snapshot s1 and snapshot s2 under directory /snap:
    M       .
    -       ./sum.sh
Compared with s2, s1 has one extra file, sum.sh. A convenient way to read the report is s1 - xxx = s2: applying the listed changes to s1 yields s2.
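To make the report format concrete, here is a minimal sketch that derives the same kind of entries from two directory listings. It is not how HDFS computes the report: snapshot_diff is a hypothetical function, each snapshot is modeled as a flat dictionary of file name to size, and rename detection (the R marker) is left out because it needs inode identity, which this model does not have.

```python
def snapshot_diff(from_snap, to_snap):
    """Return (marker, path) entries describing how to get from from_snap to to_snap."""
    report = []
    if set(from_snap) != set(to_snap):
        # Adding or removing children counts as a modification of the directory itself.
        report.append(("M", "."))
    for name in sorted(set(from_snap) | set(to_snap)):
        if name not in to_snap:
            report.append(("-", "./" + name))
        elif name not in from_snap:
            report.append(("+", "./" + name))
        elif from_snap[name] != to_snap[name]:
            report.append(("M", "./" + name))
    return report


# Listings taken from the experiment above (file name -> size in bytes).
s1 = {"hehe.ora": 831, "sum.sh": 72, "test.sh": 754}
s2 = {"hehe.ora": 831, "test.sh": 754}
for marker, path in snapshot_diff(s1, s2):
    print(f"{marker}\t{path}")
```

For the s1 and s2 listings from the experiment, the sketch yields an M entry for the directory and a - entry for ./sum.sh, matching the two lines of the real report.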
Snapshot information can also be viewed through the NameNode web UI:
http://192.168.255.169:50070/dfshealth.html#tab-snapshot
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
HDFS Snapshot Learning