Full HDFS command manual (Part 1)


HDFS is designed to follow the file operation commands of Linux, so if you are familiar with Linux file commands the HDFS equivalents will feel natural. Note, however, that Hadoop DFS has no concept of a current working directory (there is no pwd), so full paths are required, as the short sketch below illustrates. (This article is based on Hadoop 2.5 / CDH 5.2.1.)
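For example (a minimal sketch; the user name and paths are illustrative), a path that does not begin with / is resolved against the user's home directory /user/<currentUser>, not against any working directory:

# an absolute path is always unambiguous
hdfs dfs -ls /tmp/txt
# a bare name is resolved against the home directory,
# e.g. /user/hdfs/txt here; there is no cd or pwd to change this
hdfs dfs -ls txt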
To list the available commands, their usage, and their help text, and to select a namenode other than the one set in the configuration files:

hdfs dfs -usage
hadoop dfs -usage ls
hadoop dfs -help

-fs: specify a namenode
hdfs dfs -fs hdfs://test1:9000 -ls /

---------------------------
-df [-h] [path ...] :
Shows the capacity, free and used space of the filesystem. If the filesystem
has multiple partitions, and no path to a particular partition is specified,
then the status of the root partitions will be shown.

hdfs dfs -df
Filesystem          Size          Used   Available     Use%
hdfs://test1:9000   413544071168  98304  345612906496  0%

---------------------------
-mkdir [-p] path ... :
Create a directory in the specified location.

-p  Do not fail if the directory already exists

-rmdir dir ... :
Removes the directory entry specified by each directory argument, provided it
is empty.

hdfs dfs -mkdir /tmp
hdfs dfs -mkdir /tmp/txt
hdfs dfs -rmdir /tmp/txt
hdfs dfs -mkdir -p /tmp/txt/hello

---------------------------
-copyFromLocal [-f] [-p] localsrc ... dst :
Identical to the -put command.

-copyToLocal [-p] [-ignoreCrc] [-crc] src ... localdst :
Identical to the -get command.
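For instance (a minimal sketch; the file names are illustrative), the pair mirrors -put and -get exactly:

# same effect as: hdfs dfs -put hadoop.txt /tmp
hdfs dfs -copyFromLocal hadoop.txt /tmp
# same effect as: hdfs dfs -get /tmp/hadoop.txt hadoop.local.txt
hdfs dfs -copyToLocal /tmp/hadoop.txt hadoop.local.txt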

-moveFromLocal localsrc ... dst :
Same as -put, except that the source is deleted after it is copied.

-put [-f] [-p] localsrc ... dst :
Copy files from the local file system into fs. Copying fails if the file
already exists, unless the -f flag is given. Passing -p preserves access and
modification times, ownership and the mode. Passing -f overwrites the
destination if it already exists.

-get [-p] [-ignoreCrc] [-crc] src ... localdst :
Copy files that match the file pattern src to the local name. src is kept.
When copying multiple files, the destination must be a directory. Passing -p
preserves access and modification times, ownership and the mode.
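A quick round trip (a sketch; it assumes /tmp/hadoop.txt holds the line created in the wildcard examples further below):

# download an HDFS file, preserving times, ownership and mode
hdfs dfs -get -p /tmp/hadoop.txt /tmp/hadoop.local.txt
cat /tmp/hadoop.local.txt
Hello, Hadoop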

-getmerge [-nl] src localdst :
Get all the files in the directories that match the source file pattern and
merge and sort them to only one file on local fs. src is kept.

-nl  Add a newline character at the end of each file.
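A hedged sketch (it reuses the /tmp/*.txt files created in the wildcard examples below; the pattern is quoted so the glob is expanded by HDFS rather than the local shell):

# merge every *.txt under /tmp into one local file,
# appending a newline after each file's content
hdfs dfs -getmerge -nl '/tmp/*.txt' merged.txt
cat merged.txt
Hello, Hadoop
Hello, HDFS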

-cat [-ignoreCrc] src ... :
Fetch all files that match the file pattern src and display their content on
stdout.

# Wildcards: ? * {} []
hdfs dfs -cat /tmp/*.txt
Hello, Hadoop
Hello, HDFS
hdfs dfs -cat /tmp/h?fs.txt
Hello, HDFS
hdfs dfs -cat /tmp/h{a,d}*.txt
Hello, Hadoop
Hello, HDFS
hdfs dfs -cat /tmp/h[a-d]*.txt
Hello, Hadoop
Hello, HDFS

echo "Hello, Hadoop" > hadoop.txt
echo "Hello, HDFS" > hdfs.txt
dd if=/dev/zero of=/tmp/test.zero bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 0.93978 s, 1.1 GB/s
hdfs dfs -moveFromLocal /tmp/test.zero /tmp
hdfs dfs -put *.txt /tmp

---------------------------
-ls [-d] [-h] [-R] [path ...] :
List the contents that match the specified file pattern. If path is not
specified, the contents of /user/<currentUser> will be listed. Directory
entries are of the form:
    permissions - userId groupId sizeOfDirectory (in bytes)
    modificationDate (yyyy-MM-dd HH:mm) directoryName

and file entries are of the form:
    permissions numberOfReplicas userId groupId sizeOfFile (in bytes)
    modificationDate (yyyy-MM-dd HH:mm) fileName

-d  Directories are listed as plain files.
-h  Formats the sizes of files in a human-readable fashion rather than a
    number of bytes.
-R  Recursively list the contents of directories.

hdfs dfs -ls /tmp
hdfs dfs -ls -d /tmp
hdfs dfs -ls -h /tmp
Found 4 items
-rw-r--r--   3 hdfs supergroup         14 2014-12-18 10:00 /tmp/hadoop.txt
-rw-r--r--   3 hdfs supergroup         12 2014-12-18 10:00 /tmp/hdfs.txt
-rw-r--r--   3 hdfs supergroup        1 G 2014-12-18 10:19 /tmp/test.zero
drwxr-xr-x   - hdfs supergroup          0 2014-12-18 10:07 /tmp/txt
hdfs dfs -ls -R -h /tmp
-rw-r--r--   3 hdfs supergroup         14 2014-12-18 10:00 /tmp/hadoop.txt
-rw-r--r--   3 hdfs supergroup         12 2014-12-18 10:00 /tmp/hdfs.txt
-rw-r--r--   3 hdfs supergroup        1 G 2014-12-18 10:19 /tmp/test.zero
drwxr-xr-x   - hdfs supergroup          0 2014-12-18 10:07 /tmp/txt
drwxr-xr-x   - hdfs supergroup          0 2014-12-18 10:07 /tmp/txt/hello

---------------------------
-checksum src ... :
Dump checksum information for files that match the file pattern src to
stdout. Note that this requires a round-trip to a datanode storing each block
of the file, and thus is not efficient to run on a large number of files. The
checksum of a file depends on its content, block size and the checksum
algorithm and parameters used for creating the file.

hdfs dfs -checksum /tmp/test.zero
/tmp/test.zero    MD5-of-262144MD5-of-512CRC32C    000002000000000000040000f960570129a4ef3a7e179073adceae97

---------------------------
-appendToFile localsrc ... dst :
Appends the contents of all the given local files to the given dst file. The
dst file will be created if it does not exist. If localsrc is -, then the
input is read from stdin.

hdfs dfs -appendToFile *.txt hello.txt
hdfs dfs -cat hello.txt
Hello, Hadoop
Hello, HDFS

---------------------------
-tail [-f] file :
Show the last 1KB of the file.

hdfs dfs -tail -f hello.txt
# waits for output; press Ctrl+C to stop
# in another terminal:
hdfs dfs -appendToFile - hello.txt
# then type something

---------------------------
-cp [-f] [-p | -p[topax]] src ... dst :
Copy files that match the file pattern src to a destination. When copying
multiple files, the destination must be a directory. Passing -p preserves
status [topax] (timestamps, ownership, permission, ACLs, XAttr). If -p is
specified with no arg, then preserves timestamps, ownership and permission.
If -pa is specified, then it preserves permission also, because ACL is a
super-set of permission. Passing -f overwrites the destination if it already
exists. raw namespace extended attributes are preserved if (1) they are
supported (HDFS only) and (2) all of the source and target pathnames are in
the /.reserved/raw hierarchy. raw namespace xattr preservation is determined
solely by the presence (or absence) of the /.reserved/raw prefix and not by
the -p option.
-mv src ... dst :
Move files that match the specified file pattern src to a destination dst.
When moving multiple files, the destination must be a directory.
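A short sketch (the file names here are hypothetical):

hdfs dfs -mkdir /tmp/archive
# a rename within HDFS is a metadata operation; no data is copied
hdfs dfs -mv /tmp/a.txt /tmp/a.renamed.txt
# with several sources, the destination must be a directory
hdfs dfs -mv /tmp/a.renamed.txt /tmp/b.txt /tmp/archive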
-rm [-f] [-r|-R] [-skipTrash] src ... :
Delete all files that match the specified file pattern. Equivalent to the
Unix command "rm src".

-skipTrash  Bypasses trash, if enabled, and immediately deletes src.
-f          If the file does not exist, do not display a diagnostic message
            or modify the exit status to reflect an error.
-[rR]       Recursively deletes directories.
-stat [format] path ... :
Print statistics about the file/directory at path in the specified format.
Format accepts filesize in blocks (%b), group name of owner (%g), filename
(%n), block size (%o), replication (%r), user name of owner (%u), and
modification date (%y, %Y).

hdfs dfs -stat /tmp/hadoop.txt
2014-12-18 02:00:08
hdfs dfs -cp -p -f /tmp/hello.txt /tmp/hello.txt.bak
hdfs dfs -stat /tmp/hello.txt.bak
hdfs dfs -rm /tmp/not_exists
rm: `/tmp/not_exists': No such file or directory
echo $?
1
hdfs dfs -rm -f /tmp/123321123123123
echo $?
0
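The format specifiers can be combined; a hedged sketch (the block size, replication and owner shown are illustrative defaults for this cluster):

hdfs dfs -stat "%n %o %r %u:%g %b" /tmp/hadoop.txt
hadoop.txt 134217728 3 hdfs:supergroup 14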

---------------------------
-count [-q] path ... :
Count the number of directories, files and bytes under the paths that match
the specified file pattern. The output columns are:
    DIR_COUNT FILE_COUNT CONTENT_SIZE FILE_NAME
or, with -q:
    QUOTA REMAINING_QUOTA SPACE_QUOTA REMAINING_SPACE_QUOTA
    DIR_COUNT FILE_COUNT CONTENT_SIZE FILE_NAME
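With -q, the four quota columns are prepended; a sketch (on a path with no quotas set, they print as none/inf):

hdfs dfs -count -q /tmp
none  inf  none  inf  3  3  1073741850 /tmp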

-du [-s] [-h] path ... :
Show the amount of space, in bytes, used by the files that match the
specified file pattern. The following flags are optional:

-s  Rather than showing the size of each individual file that matches the
    pattern, shows the total (summary) size.
-h  Formats the sizes of files in a human-readable fashion rather than a
    number of bytes.

Note that, even without the -s option, this only shows size summaries one
level deep into a directory.

The output is in the form:
    size name (full path)

hdfs dfs -count /tmp
           3            3         1073741850 /tmp
hdfs dfs -du /tmp
14          /tmp/hadoop.txt
12          /tmp/hdfs.txt
1073741824  /tmp/test.zero
0           /tmp/txt
hdfs dfs -du -s /tmp
1073741850  /tmp
hdfs dfs -du -s -h /tmp
1.0 G  /tmp

---------------------------
-chgrp [-R] group path ... :
This is equivalent to -chown ... :GROUP ...
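In other words (a minimal sketch, reusing the group and path from the sticky-bit example below):

# these two commands have the same effect
hdfs dfs -chgrp -R hadoop /user/spark
hdfs dfs -chown -R :hadoop /user/spark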

-chmod [-R] MODE[,MODE ...] | OCTALMODE path ... :
Changes permissions of a file. This works similar to the shell's chmod
command with a few exceptions.

-R         Modifies the files recursively. This is the only option currently
           supported.
MODE       Mode is the same as mode used for the shell's command. The only
           letters recognized are 'rwxXt', e.g. +t, a+r, g-w, +rwx, o=r.
OCTALMODE  Mode specified in 3 or 4 digits. If 4 digits, the first may be 1
           or 0 to turn the sticky bit on or off, respectively. Unlike the
           shell command, it is not possible to specify only part of the
           mode, e.g. 754 is same as u=rwx,g=rx,o=r.

If none of 'augo' is specified, 'a' is assumed and, unlike the shell command,
no umask is applied.

-chown [-R] [OWNER][:[GROUP]] path ... :
Changes owner and group of a file. This is similar to the shell's chown
command with a few exceptions.

-R  Modifies the files recursively. This is the only option currently
    supported.

If only the owner or group is specified, then only the owner or group is
modified. The owner and group names may only consist of digits, alphabet,
and any of [-_./@a-zA-Z0-9]. The names are case sensitive.

WARNING: Avoid using '.' to separate user name and group though Linux allows
it. If user names have dots in them and you are using local file system, you
might see surprising results since the shell command 'chown' is used for
local files.

-touchz path ... :
Creates a file of zero length at path, with the current time as the timestamp
of that path. An error is returned if the file exists with non-zero length.

hdfs dfs -mkdir -p /user/spark/tmp
hdfs dfs -chown -R spark:hadoop /user/spark
hdfs dfs -chmod -R 775 /user/spark/tmp
hdfs dfs -ls -d /user/spark/tmp
drwxrwxr-x   - spark hadoop          0 /user/spark/tmp
hdfs dfs -chmod +t /user/spark/tmp

# as user spark:
hdfs dfs -touchz /user/spark/tmp/own_by_spark

# as user hadoop:
useradd -g hadoop hadoop
su - hadoop
id
uid=502(hadoop) gid=492(hadoop) groups=492(hadoop)
hdfs dfs -rm /user/spark/tmp/own_by_spark
rm: Permission denied by sticky bit setting: user=hadoop, inode=own_by_spark
# a superuser (a member of dfs.permissions.superusergroup, here hdfs) can
# ignore the sticky bit setting

---------------------------
-test -[defsz] path :
Answer various questions about path, with result via exit status.

-d  return 0 if path is a directory.
-e  return 0 if path exists.
-f  return 0 if path is a file.
-s  return 0 if file path is greater than zero bytes in size.
-z  return 0 if file path is zero bytes in size, else return 1.

hdfs dfs -test -d /tmp
echo $?
0
hdfs dfs -test -f /tmp/txt
echo $?
1

---------------------------
-setrep [-R] [-w] rep path ... :
Set the replication level of a file. If path is a directory then the command
recursively changes the replication factor of all files under the directory
tree rooted at path.

-w  Requests that the command waits for the replication to complete. This
    can potentially take a very long time.

hdfs fsck /tmp/test.zero -blocks -locations
Average block replication:    3.0
hdfs dfs -setrep -w 4 /tmp/test.zero
Replication 4 set: /tmp/test.zero
Waiting for /tmp/test.zero .... done
hdfs fsck /tmp/test.zero -blocks
Average block replication:    4.0

This article is from http://debugo.com; the original address is http://debugo.com/hdfs-cmd1/. Thanks to the original author for sharing.
