Basic shell operations for HDFS


(1) Distributed file systems

As the amount of data grows beyond what a single operating system can hold, data is spread across the disks of many machines, but files scattered this way are hard to manage and maintain. A system is therefore needed to manage files across multiple machines: a distributed file system. It is a file system that allows files to be shared across multiple hosts over a network, so that users on many machines can share files and storage space.

Its most important characteristic is transparency: files are actually accessed over the network, but to programs and users it looks just like accessing a local disk. Even if some nodes go offline, the system as a whole can continue to operate without losing data.

There are many distributed file systems, and HDFS is just one of them. HDFS is suited to write-once, read-many workloads; it does not support concurrent writers, and it is not a good fit for large numbers of small files.

(2) Common HDFS shell operations

HDFS is also a file system, and its shell operations are similar to those of Linux.

① Viewing a directory

Command: hadoop fs -ls PATH

For example: hadoop fs -ls hdfs://myhadoop:9000/

Why do we append hdfs://myhadoop:9000/? It is the root of the HDFS file system. Take a look at the core-site.xml file in /usr/local/hadoop/conf; this configuration file defines the default (root) HDFS file system:
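
As a rough sketch (the host name myhadoop and port 9000 match the example above; the exact values depend on your installation), the relevant property in core-site.xml looks something like this:

<property>
    <name>fs.default.name</name>
    <value>hdfs://myhadoop:9000</value>
</property>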

Of course, hadoop fs -ls / works as well; it falls back to the file system configured in core-site.xml. hadoop fs -lsr recursively displays the directory structure of the given path.

② Recursively viewing all files in the root directory

Command: hadoop fs -lsr hdfs://myhadoop:9000/

In the output, the second column shows the number of replicas of the file. A file may be stored several times in HDFS, which provides backup copies; a "-" in this column indicates a directory, which has no replicas. The default replication factor is 3; on a pseudo-distributed system it is usually set to 1. This is configured in the hdfs-site.xml file in /usr/local/hadoop/conf:
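
For reference, the replication factor is controlled by the dfs.replication property; in a pseudo-distributed setup the entry in hdfs-site.xml typically looks something like this:

<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>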

③ Creating a directory

Command: hadoop fs -mkdir PATH

For example: hadoop fs -mkdir /d1

④ Uploading files

Command: hadoop fs -put <source file (Linux)> <destination path (HDFS)>

For example, upload the core-site.xml file in the current directory to the /d1 directory just created.

The command is: hadoop fs -put ./core-site.xml /d1

⑤ Downloading files

Command: hadoop fs -get <source file (HDFS)> <destination path (Linux)>

For example: download the core-site.xml file from the /d1 directory in HDFS to the Linux desktop.

The command is: hadoop fs -get /d1/core-site.xml /root/desktop/

⑥ Viewing files

Command: hadoop fs -text FILE

For example: view the core-site.xml file under /d1.

The command is: hadoop fs -text /d1/core-site.xml

⑦ Deleting files

Command: hadoop fs -rm FILE

Example: delete the core-site.xml file under /d1.
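
The command is: hadoop fs -rm /d1/core-site.xml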

⑧ Recursive deletion

Command: hadoop fs -rmr PATH

Example: recursively delete everything under /d1.
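
The command is: hadoop fs -rmr /d1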

⑨ Viewing the help for a command

Command: hadoop fs -help COMMAND

Example: view the help for ls.
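
The command is: hadoop fs -help ls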

FS Shell
File system (FS) shell commands are invoked as bin/hadoop fs <args>. All FS shell commands take URI paths as arguments. The URI format is scheme://authority/path. For the HDFS file system the scheme is hdfs, and for the local file system the scheme is file. The scheme and authority parts are optional; if not specified, the default scheme from the configuration is used. An HDFS file or directory such as /parent/child can be written as hdfs://namenode:namenodeport/parent/child, or simply as /parent/child (assuming the default in your configuration is namenode:namenodeport). Most FS shell commands behave like the corresponding Unix shell commands; differences are noted in the individual command descriptions below. Error messages are written to stderr, and other output to stdout.
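For example, assuming the default file system configured earlier is hdfs://myhadoop:9000, the following two commands are equivalent:
hadoop fs -ls hdfs://myhadoop:9000/d1
hadoop fs -ls /d1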

cat
Usage: hadoop fs -cat URI [URI ...]
Outputs the contents of the specified files to stdout.
Example:
hadoop fs -cat hdfs://host1:port1/file1 hdfs://host2:port2/file2
hadoop fs -cat file:///file3 /user/hadoop/file4
Return value: returns 0 on success and -1 on error.

chgrp
Usage: hadoop fs -chgrp [-R] GROUP URI [URI ...]
Changes the group that files belong to. With -R, the change is applied recursively through the directory structure. The user running the command must be the owner of the files or a superuser. For more information, see the HDFS Permissions User Guide.
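For example (the group name hadoop and the path are placeholders):
hadoop fs -chgrp -R hadoop /user/hadoop/dir1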

chmod
Usage: hadoop fs -chmod [-R] MODE URI [URI ...]
Changes the permissions of files. With -R, the change is applied recursively through the directory structure. The user running the command must be the owner of the files or a superuser. For more information, see the HDFS Permissions User Guide.
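For example (the mode and path are placeholders):
hadoop fs -chmod -R 755 /user/hadoop/dir1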

chown
Usage: hadoop fs -chown [-R] [OWNER][:[GROUP]] URI [URI ...]
Changes the owner of files. With -R, the change is applied recursively through the directory structure. The user running the command must be a superuser. For more information, see the HDFS Permissions User Guide.
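For example (the user, group, and path are placeholders):
hadoop fs -chown -R hadoop:hadoop /user/hadoop/dir1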

copyFromLocal
Usage: hadoop fs -copyFromLocal <localsrc> URI
Similar to the put command, except that the source must be a local file.
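For example (the file names are placeholders):
hadoop fs -copyFromLocal localfile /user/hadoop/hadoopfile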

copyToLocal
Usage: hadoop fs -copyToLocal [-ignorecrc] [-crc] URI <localdst>
Similar to the get command, except that the destination must be a local file.
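For example (the file names are placeholders):
hadoop fs -copyToLocal /user/hadoop/hadoopfile localfile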

cp
Usage: hadoop fs -cp URI [URI ...] <dest>
Copies files from the source path(s) to the destination path. Multiple source paths are allowed, in which case the destination must be a directory. Example:
hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2
hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2 /user/hadoop/dir
Return value: returns 0 on success and -1 on error.

du
Usage: hadoop fs -du URI [URI ...]
Displays the sizes of all files in a directory; when a single file is specified, displays the size of that file.
Example:
hadoop fs -du /user/hadoop/dir1 /user/hadoop/file1 hdfs://host:port/user/hadoop/dir1
Return value: returns 0 on success and -1 on error.

dus
Usage: hadoop fs -dus <args>
Displays a summary of file sizes.
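For example (the path is a placeholder):
hadoop fs -dus /user/hadoop/dir1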

expunge
Usage: hadoop fs -expunge
Empties the trash. See the HDFS design documentation for more information on the trash feature.

get
Usage: hadoop fs -get [-ignorecrc] [-crc] <src> <localdst>
Copies files to the local file system. The -ignorecrc option copies files even if they fail the CRC check. The -crc option copies the files together with their CRC information. Example:
hadoop fs -get /user/hadoop/file localfile
hadoop fs -get hdfs://host:port/user/hadoop/file localfile
Return value: returns 0 on success and -1 on error.

getmerge
Usage: hadoop fs -getmerge <src> <localdst> [addnl]
Takes a source directory and a destination file as input and concatenates all files in the source directory into the local destination file. The optional addnl flag adds a newline at the end of each file.
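For example (the paths are placeholders; the trailing addnl argument is optional):
hadoop fs -getmerge /user/hadoop/dir1 ./merged.txt addnl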

ls
Usage: hadoop fs -ls <args>
For a file, returns file information in the following format:
filename <number of replicas> file size modification date modification time permissions userid groupid
For a directory, returns a list of its direct children, as in Unix. Each entry in the listing has the format:
directory name <dir> modification date modification time permissions userid groupid
Example:
hadoop fs -ls /user/hadoop/file1 /user/hadoop/file2 hdfs://host:port/user/hadoop/dir1 /nonexistentfile
Return value: returns 0 on success and -1 on error.

lsr
Usage: hadoop fs -lsr <args>
Recursive version of the ls command. Similar to ls -R in Unix.

mkdir
Usage: hadoop fs -mkdir <paths>
Takes the path URIs as arguments and creates those directories. Behaves like mkdir -p in Unix, creating parent directories along the path as needed. Example:
hadoop fs -mkdir /user/hadoop/dir1 /user/hadoop/dir2
hadoop fs -mkdir hdfs://host1:port1/user/hadoop/dir hdfs://host2:port2/user/hadoop/dir
Return value: returns 0 on success and -1 on error.

moveFromLocal
Usage: dfs -moveFromLocal <src> <dst>
Outputs a "not implemented" message.

mv
Usage: hadoop fs -mv URI [URI ...] <dest>
Moves files from the source path(s) to the destination path. Multiple source paths are allowed, in which case the destination must be a directory. Moving files across different file systems is not allowed. Example:
hadoop fs -mv /user/hadoop/file1 /user/hadoop/file2
hadoop fs -mv hdfs://host:port/file1 hdfs://host:port/file2 hdfs://host:port/file3 hdfs://host:port/dir1
Return value: returns 0 on success and -1 on error.

put
Usage: hadoop fs -put <localsrc> ... <dst>
Copies one or more source paths from the local file system to the target file system. Reading input from stdin and writing it to the target file system is also supported.
hadoop fs -put localfile /user/hadoop/hadoopfile
hadoop fs -put localfile1 localfile2 /user/hadoop/hadoopdir
hadoop fs -put localfile hdfs://host:port/hadoop/hadoopfile
hadoop fs -put - hdfs://host:port/hadoop/hadoopfile reads the input from stdin.
Return value: returns 0 on success and -1 on error.

rm
Usage: hadoop fs -rm URI [URI ...]
Deletes the specified files; only non-empty directories and files are deleted. See the rmr command for recursive deletes. Example:
hadoop fs -rm hdfs://host:port/file /user/hadoop/emptydir
Return value: returns 0 on success and -1 on error.

rmr
Usage: hadoop fs -rmr URI [URI ...]
Recursive version of delete. Example:
hadoop fs -rmr /user/hadoop/dir
hadoop fs -rmr hdfs://host:port/user/hadoop/dir
Return value: returns 0 on success and -1 on error.

setrep
Usage: hadoop fs -setrep [-R] <path>
Changes the replication factor of a file. The -R option recursively changes the replication factor of all files in a directory. Example:
hadoop fs -setrep -w 3 -R /user/hadoop/dir1
Return value: returns 0 on success and -1 on error.

stat
Usage: hadoop fs -stat URI [URI ...]
Returns statistics about the specified path. Example:
hadoop fs -stat path
Return value: returns 0 on success and -1 on error.

tail
Usage: hadoop fs -tail [-f] URI
Outputs the last kilobyte of the file to stdout. The -f option is supported and behaves as in Unix.
Example:
hadoop fs -tail pathname
Return value: returns 0 on success and -1 on error.

test
Usage: hadoop fs -test -[ezd] URI
Options:
-e checks whether the file exists; returns 0 if it does.
-z checks whether the file is zero bytes long; returns 0 if it is.
-d returns 1 if the path is a directory, otherwise returns 0.
Example:
hadoop fs -test -e filename

text
Usage: hadoop fs -text <src>
Outputs the source file in text format. The allowed formats are zip and TextRecordInputStream.
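For example (the path is a placeholder):
hadoop fs -text /user/hadoop/data.zip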

touchz
Usage: hadoop fs -touchz URI [URI ...]
Creates an empty file of zero bytes. Example:
hadoop fs -touchz pathname
Return value: returns 0 on success and -1 on error.

(3) Summary of HDFS shell operations:

Option          Usage format                                                                  Meaning
-ls             -ls <path>                                                                    List the directory structure of the specified path
-lsr            -lsr <path>                                                                   Recursively list the directory structure of the specified path
-du             -du <path>                                                                    Show the size of each file under the directory
-dus            -dus <path>                                                                   Show the total size of the files (folders) under the directory
-count          -count [-q] <path>                                                            Count the number of files (folders)
-mv             -mv <source path> <destination path>                                          Move
-cp             -cp <source path> <destination path>                                          Copy
-rm             -rm [-skipTrash] <path>                                                       Delete files / empty folders
-rmr            -rmr [-skipTrash] <path>                                                      Recursive delete
-put            -put <local files> <HDFS path>                                                Upload files
-copyFromLocal  -copyFromLocal <local files> <HDFS path>                                      Copy from local
-moveFromLocal  -moveFromLocal <local files> <HDFS path>                                      Move from local
-getmerge       -getmerge <source path> <local path>                                          Merge to local
-cat            -cat <HDFS path>                                                              View file contents
-text           -text <HDFS path>                                                             View file contents
-copyToLocal    -copyToLocal [-ignoreCrc] [-crc] <HDFS source path> <local destination path>  Copy to local
-moveToLocal    -moveToLocal [-crc] <HDFS source path> <local destination path>               Move to local
-mkdir          -mkdir <HDFS path>                                                            Create an empty folder
-setrep         -setrep [-R] [-w] <replica count> <path>                                      Change the number of replicas
-touchz         -touchz <file path>                                                           Create an empty file
-stat           -stat [format] <path>                                                         Display file statistics
-tail           -tail [-f] <file>                                                             View the end of a file
-chmod          -chmod [-R] <permission mode> <path>                                          Change permissions
-chown          -chown [-R] [owner][:[group]] <path>                                          Change owner
-chgrp          -chgrp [-R] <group name> <path>                                               Change group
-help           -help [command]                                                               Help
