I. Introduction to HDFS shell commands
HDFS is a distributed file system for storing and accessing data. Operations on HDFS are the basic file system operations: creating, modifying, and deleting files, changing permissions, and creating, deleting, and renaming folders. The HDFS commands are similar to the Linux shell commands for files, such as ls, mkdir, and rm.
Before running HDFS shell operations, we must make sure that Hadoop is running properly. We can use the jps command to check that each Hadoop process is visible.
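As a rough sketch of that check, the helper below reads jps output and reports which HDFS daemons are missing. The function name missing_daemons and the list of expected daemons (NameNode, DataNode, SecondaryNameNode, as on a typical single-node Hadoop 1.x setup) are assumptions for illustration, not something this article prescribes.

```shell
# Hypothetical helper: reads `jps` output on stdin and prints the HDFS
# daemons that do not appear in it (an empty line means all were found).
missing_daemons() {
  jps_output=$(cat)
  missing=""
  # Assumed daemon list; adjust it to match your cluster layout.
  for daemon in NameNode DataNode SecondaryNameNode; do
    if ! printf '%s\n' "$jps_output" |
         awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      missing="$missing $daemon"
    fi
  done
  printf '%s\n' "${missing# }"
}
```

On a machine where Hadoop is installed, it could be used as: jps | missing_daemons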
We can run the hadoop fs command without arguments to view all the commands for HDFS shell operations, as follows.
# hadoop fs
Usage: java FsShell
[-ls <path>]
[-lsr <path>]
[-du <path>]
[-dus <path>]
[-count [-q] <path>]
[-mv <src> <dst>]
[-cp <src> <dst>]
[-rm [-skipTrash] <path>]
[-rmr [-skipTrash] <path>]
[-expunge]
[-put <localsrc> ... <dst>]
[-copyFromLocal <localsrc> ... <dst>]
[-moveFromLocal <localsrc> ... <dst>]
[-get [-ignoreCrc] [-crc] <src> <localdst>]
[-getmerge <src> <localdst> [addnl]]
[-cat <src>]
[-text <src>]
[-copyToLocal [-ignoreCrc] [-crc] <src> <localdst>]
[-moveToLocal [-crc] <src> <localdst>]
[-mkdir <path>]
[-setrep [-R] [-w] <rep> <path/file>]
[-touchz <path>]
[-test -[ezd] <path>]
[-stat [format] <path>]
[-tail [-f] <file>]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-chgrp [-R] GROUP PATH...]
[-help [cmd]]
II. Options of HDFS shell operation commands
Option name | Format | Description
-ls | -ls <path> | View the directory structure of the specified path
-lsr | -lsr <path> | Recursively view the directory structure of the specified path
-du | -du <path> | Show the size of each file in the directory
-dus | -dus <path> | Show the total size of the files (folders) in the directory
-count | -count [-q] <path> | Count the number of files (folders)
-mv | -mv <source path> <destination path> | Move
-cp | -cp <source path> <destination path> | Copy
-rm | -rm [-skipTrash] <path> | Delete files/empty folders
-rmr | -rmr [-skipTrash] <path> | Recursive deletion
-put | -put <one or more Linux files> <HDFS path> | Upload files
-copyFromLocal | -copyFromLocal <one or more Linux files> <HDFS path> | Copy from local
-moveFromLocal | -moveFromLocal <one or more Linux files> <HDFS path> | Move from local
-getmerge | -getmerge <source path> <Linux path> | Merge to local
-cat | -cat <HDFS path> | View file content
-text | -text <HDFS path> | View file content
-copyToLocal | -copyToLocal [-ignoreCrc] [-crc] <HDFS source path> <Linux destination path> | Copy to local
-moveToLocal | -moveToLocal [-crc] <HDFS source path> <Linux destination path> | Move to local
-setrep | -setrep [-R] [-w] <number of replicas> <path> | Modify the number of replicas
-mkdir | -mkdir <HDFS path> | Create an empty folder
-touchz | -touchz <file path> | Create an empty file
-stat | -stat [format] <path> | Display file statistics
-tail | -tail [-f] <file> | View the end of a file
-chmod | -chmod [-R] <permission mode> <path> | Modify permissions
-chown | -chown [-R] [owner][:[group]] <path> | Modify owner
-chgrp | -chgrp [-R] <group name> <path> | Modify group
-help | -help [command option] | Help
III. Usage of various command options
1. -ls: display the current directory structure
<1> This command option shows the directory structure of the specified path. It is followed by the HDFS path, as shown in Figure 3.1.
Fig 3.1
Let's explain the content format of each row:
The first character indicates a folder (if it is "d") or a file (if it is "-");
The following nine characters indicate permissions;
The following number or "-" indicates the number of replicas. A file shows a number; a folder shows "-" because folders have no replicas;
The following "root" indicates the owner;
The subsequent "supergroup" indicates the group;
The following "0" and "84927175" indicate the file size in bytes;
The following time indicates the modification time, in the format of year, month, and day;
The last entry indicates the file path.
The root directory contains one folder and one file.
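To make the column layout above concrete, here is a small sketch that labels the fields of one line of -ls output. The function name describe_ls_line is hypothetical, and the sample line is made up for illustration (only the owner, group, and size values come from the description above). It assumes paths without spaces.

```shell
# Hypothetical helper: label the columns of `hadoop fs -ls` output.
# Lines with fewer than 8 fields (such as a "Found N items" header)
# are skipped.
describe_ls_line() {
  awk 'NF >= 8 {
    kind = (substr($1, 1, 1) == "d") ? "folder" : "file"
    printf "type=%s perms=%s replicas=%s owner=%s group=%s size=%s path=%s\n",
           kind, substr($1, 2), $2, $3, $4, $5, $8
  }'
}

# Made-up sample line; the date, time, and path are placeholders.
sample='-rw-r--r--   1 root supergroup   84927175 2014-01-01 12:00 /hello.tar'
printf '%s\n' "$sample" | describe_ls_line
```

With a running cluster, the same helper could be fed real output: hadoop fs -ls / | describe_ls_line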
<2> If the command option is not followed by a path, the /user/<current user> directory is accessed. We log on as the root user, so the /user/root directory of HDFS is accessed. If /user/root does not exist, an error saying that the file does not exist is reported, as shown in Figure 3.2. Create the directory and run the command again, as shown in Figures 3.3 and 3.4.
Fig 3.2
Fig 3.3
Fig 3.4
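The sequence described above can be sketched as follows. The HADOOP variable and the function names are assumptions added for illustration: HADOOP defaults to the hadoop command and can be set to echo to print the commands instead of running them. On Hadoop 2 and later, -mkdir may need the -p flag when /user does not exist yet.

```shell
# Assumption: HADOOP can be overridden (e.g. HADOOP=echo) for a dry run.
HADOOP="${HADOOP:-hadoop}"

# Default HDFS directory that -ls uses when no path is given.
hdfs_home() {
  printf '/user/%s\n' "$1"
}

# Create the home directory for the given user, then list it
# without passing an explicit path.
create_and_list_home() {
  "$HADOOP" fs -mkdir "$(hdfs_home "$1")" || return 1
  "$HADOOP" fs -ls
}
```

For the root user in this article, the call would be: create_and_list_home root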
2. -lsr: recursively display the directory structure
This command option recursively displays the directory structure of the specified path. It is followed by the HDFS path, as shown in Figure 3.5.
Fig 3.5
The /user directory contains a root directory, and the root directory contains the file hello.
3. -du: show the file sizes in the directory
This command shows the size of each file in the specified path, in bytes, as shown in Figure 3.6.
Fig 3.6
4. -dus: summarize the file sizes in the directory
This command shows the total size of the files in the specified path, in bytes, as shown in Figure 3.7.
Fig 3.7
Compare the differences between Figure 3.6 and Figure 3.7 to understand the different meanings of the two Command Options.
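One way to see the relationship between the two options: adding up the sizes printed by -du should give the single total printed by -dus. The helper below is a sketch with a hypothetical name, sum_du_sizes. It assumes the size is the first column of each -du output line and ignores any non-numeric header line.

```shell
# Hypothetical helper: sum the size column of `hadoop fs -du` output
# read from stdin. Lines whose first field is not a number are ignored.
sum_du_sizes() {
  awk '$1 ~ /^[0-9]+$/ { total += $1 } END { printf "%.0f\n", total }'
}
```

With a running cluster, the result of hadoop fs -du / | sum_du_sizes can be compared against the number reported by hadoop fs -dus /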
5. -count: count the number of files (folders)
This command displays the number of folders, the number of files, and the total file size in the specified path, as shown in Figure 3.8.
Fig 3.8
There are two commands in Figure 3.8. The second command is used to verify the result of the first.
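The three numbers can be hard to tell apart, so here is a small sketch that labels them. It relies on the column order of plain -count output (folder count, file count, content size, path). The function name label_count_output is hypothetical, and output produced with -q has more columns and is skipped by this helper.

```shell
# Hypothetical helper: label the four columns of `hadoop fs -count`
# output (without -q) read from stdin.
label_count_output() {
  awk 'NF == 4 { printf "folders=%s files=%s bytes=%s path=%s\n", $1, $2, $3, $4 }'
}
```

With a running cluster: hadoop fs -count / | label_count_output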
6. -mv: move
This command option moves an HDFS file to the specified HDFS directory. It is followed by two paths: the first is the source file and the second is the target directory, as shown in Figure 3.9.
Fig 3.9
In Figure 3.9, there are three commands to reflect the changes before and after the movement.
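A minimal sketch of that three-command pattern is shown below. The HADOOP variable and the function name move_and_show are assumptions for illustration; setting HADOOP=echo prints the commands instead of running them.

```shell
# Assumption: HADOOP can be overridden (e.g. HADOOP=echo) for a dry run.
HADOOP="${HADOOP:-hadoop}"

# List the target directory, move the file into it, then list it again
# so the change is visible.
move_and_show() {
  src=$1 dst_dir=$2
  "$HADOOP" fs -ls "$dst_dir"
  "$HADOOP" fs -mv "$src" "$dst_dir"
  "$HADOOP" fs -ls "$dst_dir"
}
```

Example call, using paths that appear elsewhere in this article: move_and_show /file1 /user/root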
7. -cp: copy
This command copies the specified HDFS file to the specified HDFS directory. It is followed by two paths: the first is the file to copy and the second is the destination, as shown in Figure 3.10.
Fig 3.10
In Figure 3.10, there are three commands to reflect the changes before and after the copy.
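Instead of comparing listings by eye, the copy can also be checked with the -test -e option from the usage listing, which succeeds when a path exists. The sketch below is one possible way to do this; the HADOOP variable and the function name copy_and_check are assumptions for illustration.

```shell
# Assumption: HADOOP can be overridden (e.g. HADOOP=echo) for a dry run.
HADOOP="${HADOOP:-hadoop}"

# Copy a file into a directory, then confirm that both the source and
# the new copy exist.
copy_and_check() {
  src=$1 dst_dir=$2
  "$HADOOP" fs -cp "$src" "$dst_dir" || return 1
  # Unlike -mv, the source should still exist after the copy.
  "$HADOOP" fs -test -e "$src" || return 1
  "$HADOOP" fs -test -e "$dst_dir/$(basename "$src")"
}
```

Example call: copy_and_check /file1 /user/root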
8. -rm: delete files/empty folders
This command option deletes the specified file or empty directory, as shown in Figure 3.11.
Fig 3.11
In Figure 3.11, the first three commands reflect the changes before and after the deletion. The fourth command tries to delete the non-empty "/user" directory. The operation fails, which shows that a non-empty directory cannot be deleted this way.
9. -rmr: recursive deletion
This command option recursively deletes all subdirectories and files in the specified directory, as shown in Figure 3.12.
Fig 3.12
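The two deletion options can be sketched side by side as follows. The HADOOP variable and the function names are assumptions for illustration. The -skipTrash flag comes from the usage listing and bypasses the trash, so use it with care. On newer Hadoop releases -rmr is deprecated in favor of -rm -r.

```shell
# Assumption: HADOOP can be overridden (e.g. HADOOP=echo) for a dry run.
HADOOP="${HADOOP:-hadoop}"

# Delete a single file.
delete_file() {
  "$HADOOP" fs -rm "$1"
}

# Delete a directory and everything below it.
delete_tree() {
  "$HADOOP" fs -rmr "$1"
}

# Same as delete_tree, but bypass the trash so the data is removed at once.
delete_tree_now() {
  "$HADOOP" fs -rmr -skipTrash "$1"
}
```

Example calls: delete_file /file1 and delete_tree /user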
10. -put: upload files
This command option copies files from Linux to HDFS, as shown in Figure 3.12.
Fig 3.12
11. -copyFromLocal: copy from local to HDFS. The operation is the same as -put, so no example is given.
12. -moveFromLocal: move from local to HDFS
This command moves files from Linux to HDFS, as shown in Figure 3.13.
Fig 3.13
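The upload options above differ only in what happens to the local file. A minimal sketch, with the HADOOP variable and function names as assumptions for illustration:

```shell
# Assumption: HADOOP can be overridden (e.g. HADOOP=echo) for a dry run.
HADOOP="${HADOOP:-hadoop}"

# Upload one or more local files; the local copies are kept.
# The last argument is the HDFS destination.
upload_keep_local() {
  "$HADOOP" fs -put "$@"
}

# Upload one or more local files; the local copies are removed afterwards.
upload_remove_local() {
  "$HADOOP" fs -moveFromLocal "$@"
}
```

Example call with two hypothetical local files: upload_keep_local a.txt b.txt /user/root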
13. -getmerge: merge to local
This command is used to merge all the files in the specified directory of HDFS into a local Linux file, as shown in Figure 3.14.
Fig 3.14
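A minimal sketch of the call is shown below. The HADOOP variable, the function name merge_to_local, and the local file name in the example are assumptions for illustration.

```shell
# Assumption: HADOOP can be overridden (e.g. HADOOP=echo) for a dry run.
HADOOP="${HADOOP:-hadoop}"

# Merge every file in an HDFS directory into one local Linux file.
merge_to_local() {
  hdfs_dir=$1 local_file=$2
  "$HADOOP" fs -getmerge "$hdfs_dir" "$local_file"
}
```

Example call: merge_to_local /user/root /tmp/merged.txt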
14. -cat: view file content
This command option displays the content of a file, as shown in Figure 3.15.
Fig 3.15
15. -text: view file content
This command option can be considered to have the same role and usage as -cat, so it is omitted here.
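Because -cat writes the whole file to standard output, it combines well with ordinary Linux tools. The sketch below previews only the first lines of a file. The HADOOP variable and the function name preview_file are assumptions for illustration.

```shell
# Assumption: HADOOP can be overridden (e.g. HADOOP=echo) for a dry run.
HADOOP="${HADOOP:-hadoop}"

# Print the first N lines (default 10) of an HDFS file.
preview_file() {
  "$HADOOP" fs -cat "$1" | head -n "${2:-10}"
}
```

Example call: preview_file /user/root/hello 5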
16. -mkdir: create an empty folder
This command option creates a folder. The path that follows is the folder to be created in HDFS, as shown in Figure 3.16.
Fig 3.16
17. -setrep: set the number of replicas
<1> This command option modifies the number of replicas of stored files. It is followed by the number of replicas and then the file path, as shown in Figure 3.17.
Fig 3.17
In Figure 3.17, we change the number of replicas of the file /file1 from 1 to 2. Because one replica is added, HDFS automatically copies the file to generate the new replica.
<2> If the final path is a folder, add the option -R to change the number of replicas of all files in the folder, as shown in Figure 3.18.
Fig 3.18
In Figure 3.18, the option -R is applied to the /user/root folder, so the number of replicas of file2 and file1 under /user/root changes.
<3> Another option is -w, which makes the command exit only after the replication is complete, as shown in Figure 3.19.
Fig 3.19
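The three variants above can be sketched as follows. The HADOOP variable and the function names are assumptions for illustration.

```shell
# Assumption: HADOOP can be overridden (e.g. HADOOP=echo) for a dry run.
HADOOP="${HADOOP:-hadoop}"

# Set the number of replicas of a single file.
set_replicas() {
  "$HADOOP" fs -setrep "$1" "$2"
}

# Set the number of replicas of every file under a folder (-R) and
# return only after the replication has finished (-w).
set_replicas_recursive_wait() {
  "$HADOOP" fs -setrep -R -w "$1" "$2"
}
```

Example calls: set_replicas 2 /file1 and set_replicas_recursive_wait 2 /user/root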
18. -touchz: create an empty file
This command option creates an empty file in HDFS, as shown in Figure 3.20.
Fig 3.20
19. -stat: display file statistics
This command option displays some statistics of the file, as shown in Figure 3.21.
Fig 3.21
In Figure 3.21, the format string is enclosed in quotation marks. The format "%b %n %o %r %Y" in the example indicates the file size, file name, block size, number of replicas, and modification time, in that order.
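A minimal sketch of that call is shown below. The HADOOP variable and the function name file_stats are assumptions for illustration. According to the Hadoop FsShell documentation, %Y is the modification time in milliseconds since the epoch.

```shell
# Assumption: HADOOP can be overridden (e.g. HADOOP=echo) for a dry run.
HADOOP="${HADOOP:-hadoop}"

# Print size, name, block size, replica count, and modification time
# (milliseconds since the epoch) of a file. The format string is quoted
# so that the shell passes it as a single argument.
file_stats() {
  "$HADOOP" fs -stat "%b %n %o %r %Y" "$1"
}
```

Example call: file_stats /file1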
20. -tail: view the end of a file
This command displays the last 1 KB of the file. It is generally used to view logs. With the option -f, new content is displayed automatically as the file changes, as shown in Figure 3.22.
Fig 3.22
21. -chmod: modify file permissions
<1> This command option is used like chmod in the Linux shell. It modifies file permissions, as shown in Figure 3.23.
Fig 3.23
<2> In Figure 3.23, the permissions of the file /emptyfile are modified. With the option -R, you can modify permissions for all files in a folder, as shown in Figure 3.24.
Fig 3.24
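Both forms can be sketched as follows. The HADOOP variable, the function names, and the octal modes in the example calls are assumptions for illustration.

```shell
# Assumption: HADOOP can be overridden (e.g. HADOOP=echo) for a dry run.
HADOOP="${HADOOP:-hadoop}"

# Change the permissions of a single file or folder.
set_mode() {
  "$HADOOP" fs -chmod "$1" "$2"
}

# Change the permissions of a folder and everything below it.
set_mode_recursive() {
  "$HADOOP" fs -chmod -R "$1" "$2"
}
```

Example calls: set_mode 644 /emptyfile and set_mode_recursive 755 /user/root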
22. -chown: modify owner
<1> This command option changes the owner of a file, as shown in Figure 3.25.
Fig 3.25
The owner of the file /emptyfile is changed from root to sunddenly. <2> You can also modify the group at the same time, as shown in Figure 3.26.
Fig 3.26
In Figure 3.26, the owner and group of the file /emptyfile are both changed to itcast. To modify only the group, use ":sunddenly".
If the option -R is included, the owner and group of all files in the folder are modified recursively.
23. -chgrp: modify group
This command modifies the group that a file belongs to. It is equivalent to the "chown :group" usage, as shown in Figure 3.27.
Fig 3.27
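The owner and group variants from the last two subsections can be sketched together. The HADOOP variable and the function names are assumptions for illustration. The user and group names in the example calls are the ones used in this article.

```shell
# Assumption: HADOOP can be overridden (e.g. HADOOP=echo) for a dry run.
HADOOP="${HADOOP:-hadoop}"

# Change only the owner.
change_owner() {
  "$HADOOP" fs -chown "$1" "$2"
}

# Change the owner and the group in one call.
change_owner_and_group() {
  "$HADOOP" fs -chown "$1:$2" "$3"
}

# Change only the group, using the ":group" form of -chown.
change_group_with_chown() {
  "$HADOOP" fs -chown ":$1" "$2"
}

# Change only the group, using -chgrp.
change_group() {
  "$HADOOP" fs -chgrp "$1" "$2"
}
```

Example calls: change_owner sunddenly /emptyfile and change_owner_and_group itcast itcast /emptyfile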
24. -help: help
This command option displays help information. It is followed by the command option to look up, as shown in Figure 3.28.
Fig 3.28
Figure 3.28 shows the help output for rm.
Hadoop learning, day 8: shell operations of HDFS