Hadoop provides shell-like command-line operations for interacting with the HDFS file system. Most programmers already find shell-style commands familiar, and this is in fact one of Hadoop's great conveniences, at least for anyone who wants to become comfortable operating HDFS as quickly as possible.
Hadoop shell commands are used constantly at work to operate on files in HDFS. Yet every so often you have to go look something up again: a command is unfamiliar, you forget its parameters, or you are unsure of the difference between two similar commands. So this article summarizes the hadoop shell commands for future reference; it is also a record of that kind of lookup work.
hadoop fs
FsShell Usage: java FsShell
  [-ls <path>]
  [-lsr <path>]
  [-df [<path>]]
  [-du <path>]
  [-dus <path>]
  [-count[-q] <path>]
  [-mv <src> <dst>]
  [-cp <src> <dst>]
  [-rm [-skipTrash] <path>]
  [-rmr [-skipTrash] <path>]
  [-expunge]
  [-put <localsrc> ... <dst>]
  [-copyFromLocal <localsrc> ... <dst>]
  [-moveFromLocal <localsrc> ... <dst>]
  [-get [-ignoreCrc] [-crc] <src> <localdst>]
  [-getmerge <src> <localdst> [addnl]]
  [-cat <src>]
  [-text <src>]
  [-copyToLocal [-ignoreCrc] [-crc] <src> <localdst>]
  [-moveToLocal [-crc] <src> <localdst>]
  [-mkdir <path>]
  [-setrep [-R] [-w] <rep> <path/file>]
  [-touchz <path>]
  [-test -[ezd] <path>]
  [-stat [format] <path>]
  [-snapshot <path>]
  [-tail [-f] <file>]
  [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
  [-chown [-R] [OWNER][:[GROUP]] PATH...]
  [-chgrp [-R] GROUP PATH...]
  [-help [cmd]]
Below is a detailed explanation of each command. The commands are similar to their UNIX counterparts, and for some of them the meaning is self-evident.
hadoop fs -ls <path>
Lists the file or directory at path, returning its statistics in the form:
Permissions number_of_replicas userid groupid filesize modification_date modification_time filename
hadoop fs -lsr <path>
This is the recursive version of ls; the relationship is the same as that between ls -R and ls in UNIX.
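For example (the paths and output below are illustrative, assuming a user directory /user/hadoop exists):
hadoop fs -ls /user/hadoop
# -rw-r--r--   3 hadoop supergroup   1366 2013-01-21 10:02 /user/hadoop/data.txt
hadoop fs -lsr /user/hadoop
# also descends into subdirectories, listing every entry beneath /user/hadoop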
hadoop fs -du URI
Displays the size of the file, or the sizes of the files and directories contained in the given directory.
hadoop fs -dus URI
Similar to du -s in UNIX: displays the aggregate size of a file, or of a directory together with everything under it.
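A quick illustration of the difference (paths are hypothetical):
hadoop fs -du /user/hadoop    # one size line per file or subdirectory inside /user/hadoop
hadoop fs -dus /user/hadoop   # a single line: the total size of everything under /user/hadoop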
hadoop fs -df <path>
Displays usage information (total size, used, and available space) for the file system Hadoop is using.
hadoop fs -count [-q] <path>
Displays the number of directories and files under path, their total size, and related information. By default the output columns are:
DIR_COUNT, FILE_COUNT, CONTENT_SIZE, FILE_NAME
With the -q option, quota information is added and the output becomes:
QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, FILE_NAME
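For example (path hypothetical):
hadoop fs -count /user/hadoop      # prints DIR_COUNT FILE_COUNT CONTENT_SIZE FILE_NAME
hadoop fs -count -q /user/hadoop   # prepends the four quota columns; unset quotas show as none/inf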
hadoop fs -mv <src> <dst>
Moves files from src to dst. Multiple sources can be moved to the same dst, in which case dst must be a directory.
hadoop fs -cp <src> ... <dst>
Copies multiple sources to dst; the restriction is that dst must then be a directory.
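For example, moving or copying several sources into a directory (all paths hypothetical):
hadoop fs -mv /user/hadoop/a.txt /user/hadoop/b.txt /user/hadoop/archive
hadoop fs -cp /user/hadoop/c.txt /user/hadoop/d.txt /user/hadoop/backup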
hadoop fs -rm [-skipTrash] <path>
Deletes files; it cannot delete directories (use -rmr for those).
-skipTrash deletes the file immediately instead of moving it into .Trash.
hadoop fs -rmr [-skipTrash] <path>
The recursive version: deletes a directory together with the files under it.
-skipTrash again deletes immediately instead of moving into .Trash.
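For example (paths hypothetical):
hadoop fs -rm /user/hadoop/old.txt               # moves the file into the trash (when trash is enabled)
hadoop fs -rmr /user/hadoop/tmp                  # recursively removes the directory and its contents
hadoop fs -rmr -skipTrash /user/hadoop/scratch   # removes immediately, bypassing the trash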
hadoop fs -expunge
Empties the trash. For details about the trash feature, see http://hadoop.apache.org/docs/r1.0.4/hdfs_design.html, which explains:
When a file is deleted by a user or an application, it is not immediately removed from HDFS. Instead, HDFS first renames it to a file in the /trash directory. The file can be restored quickly as long as it remains in /trash. A file remains in /trash for a configurable amount of time. After the expiry of its life in /trash, the NameNode deletes the file from the HDFS namespace. The deletion of a file causes the blocks associated with the file to be freed. Note that there could be an appreciable time delay between the time a file is deleted by a user and the time of the corresponding increase in free space in HDFS.

A user can undelete a file after deleting it as long as it remains in the /trash directory. If a user wants to undelete a file that he/she has deleted, he/she can navigate the /trash directory and retrieve the file. The /trash directory contains only the latest copy of the file that was deleted. The /trash directory is just like any other directory with one special feature: HDFS applies specified policies to automatically delete files from this directory. The current default policy is to delete files from /trash that are more than 6 hours old. In the future, this policy will be configurable through a well defined interface.
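In practice, undeleting a file means moving it back out of the trash yourself. A sketch, assuming trash is enabled and sits at the per-user location /user/<username>/.Trash used by default in many deployments (the file keeps its full original path under Current):
hadoop fs -lsr /user/hadoop/.Trash/Current                                # locate the deleted file
hadoop fs -mv /user/hadoop/.Trash/Current/user/hadoop/old.txt /user/hadoop/old.txt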
File Transfer:
hadoop fs -put <localsrc> ... <dst>
Copies one or more files or directories from the local file system to the target file system.
hadoop fs -copyFromLocal <localsrc> ... <dst>
Similar to put; the only restriction is that the source must be on the local file system.
hadoop fs -moveFromLocal <localsrc> ... <dst>
Similar to put, except that the local source is deleted once the copy completes. Note the deletion: it removes the local files.
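For example (local and HDFS paths hypothetical):
hadoop fs -put ./log1.txt ./log2.txt /user/hadoop/logs   # copies both local files into the HDFS directory
hadoop fs -copyFromLocal ./log3.txt /user/hadoop/logs    # same effect; the source must be local
hadoop fs -moveFromLocal ./log4.txt /user/hadoop/logs    # copies, then deletes ./log4.txt locally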
hadoop fs -get [-ignoreCrc] [-crc] <src> <localdst>
Copies src on the file system to the local destination.
-ignoreCrc skips the CRC check during the copy; to copy the CRC files as well, add the -crc option.
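For example (paths hypothetical):
hadoop fs -get /user/hadoop/data.txt /tmp/data.txt     # verifies checksums while copying
hadoop fs -get -ignoreCrc /user/hadoop/data.txt /tmp/  # skips CRC verification
hadoop fs -get -crc /user/hadoop/data.txt /tmp/        # also copies the .crc checksum file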
hadoop fs -getmerge <src> <localdst> [addnl]
src is a source directory and localdst is a local target file: the command concatenates all files in the source directory into the destination file. The optional addnl flag appends a newline after the content of each file.
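This is handy for collecting a job's part files into one local file. For example (paths hypothetical):
hadoop fs -getmerge /user/hadoop/output /tmp/merged.txt         # concatenates part-00000, part-00001, ...
hadoop fs -getmerge /user/hadoop/output /tmp/merged.txt addnl   # additionally appends a newline after each file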
hadoop fs -cat <src>
Outputs the content of src to stdout, similar to cat in UNIX.
hadoop fs -text <src>
Outputs the src file in text form. The allowed input formats are zip and TextRecordInputStream.
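For example (paths hypothetical; data.seq is assumed to be a SequenceFile):
hadoop fs -cat /user/hadoop/output/part-00000 | head -n 20   # pipe through UNIX tools as usual
hadoop fs -text /user/hadoop/data.seq                        # prints the binary file as readable text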
hadoop fs -copyToLocal [-ignoreCrc] [-crc] <src> <localdst>
Similar to get; the only restriction is that the destination must be on the local file system.
hadoop fs -moveToLocal [-crc] <src> <localdst>
Not yet implemented; running it simply prints that '-moveToLocal' is not implemented yet.
hadoop fs -mkdir <path>
Creates the path directory. If path's parent directories do not exist, they are created as well, like the UNIX mkdir -p command.
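For example, one command is enough to create the whole chain of directories (path hypothetical):
hadoop fs -mkdir /user/hadoop/a/b/c   # creates a, a/b, and a/b/c as needed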
hadoop fs -setrep [-R] <rep> <path/file>
Changes the replication factor of an HDFS file or directory. For important files, increase the replication factor to guard against loss or corruption.
The -R option applies the change recursively, updating the replication factor of everything under a directory; the -w flag shown in the usage above waits until the new replication level is reached.
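For example (paths hypothetical):
hadoop fs -setrep 5 /user/hadoop/important.txt   # raise this file's replication factor to 5
hadoop fs -setrep -R -w 3 /user/hadoop/dir       # recursively set replication to 3 and wait until done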
hadoop fs -touchz <path>
Creates a file of size 0 at path.
hadoop fs -test -[ezd] <path>
Tests attributes of the path: -e checks whether the file exists; -z checks whether the file has zero length; -d checks whether the path is a directory.
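The result is reported through the exit code (0 when the condition holds), so -test combines naturally with shell logic. For example (path hypothetical):
hadoop fs -test -e /user/hadoop/data.txt && echo "exists"
hadoop fs -test -d /user/hadoop/data.txt || echo "not a directory"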
hadoop fs -stat [format] <path>
Returns statistics for the path, formatted according to the optional format string.
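For example, with a format string (path hypothetical; in this Hadoop version %n is the file name, %r the replication factor, and %y the modification time):
hadoop fs -stat "%n %r %y" /user/hadoop/data.txt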
hadoop fs -tail [-f] <file>
Displays the last 1 KB of the file. The -f option behaves as in UNIX, continuing to print new data as the file grows.
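For example, following a log file as it is appended to (path hypothetical):
hadoop fs -tail -f /user/hadoop/logs/app.log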
hadoop fs -chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...
hadoop fs -chown [-R] [OWNER][:[GROUP]] PATH...
hadoop fs -chgrp [-R] GROUP PATH...
These three are permission-management commands, and they work like their UNIX namesakes.
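For example (paths, users, and groups hypothetical; note that changing ownership typically requires superuser rights in HDFS):
hadoop fs -chmod -R 755 /user/hadoop/shared
hadoop fs -chown hadoop:supergroup /user/hadoop/data.txt
hadoop fs -chgrp -R supergroup /user/hadoop/shared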
The hadoop shell commands are fairly simple, but the differences between similar ones only really become clear through use. This article is just a memo organizing the commands commonly used at work.
From: http://isilic.iteye.com/blog/1770036