Summary: Hadoop HDFS file operations are commonly performed in two ways: from the command line and through the Java API. This article describes how to work with HDFS files in both ways.
Keywords: HDFS, file operations, command line, Java API
HDFS is a distributed file system designed for the distributed processing of massive data within the MapReduce framework.
Hadoop's HDFS file operations are usually performed in one of two ways. The first is the command line: Hadoop provides a set of command-line tools similar to the Linux file commands. The second is the Java API: Hadoop's Java library is used to manipulate HDFS files programmatically.
Mode one: Command-line mode
Hadoop file operation commands take the form
hadoop fs -cmd <args>
Description: cmd is a specific file operation command and <args> is a set of variable arguments.
Hadoop's most commonly used file manipulation commands include adding files and directories, getting files, deleting files, and so on.
1 Adding files and directories
HDFS has a default working directory of /user/$USER, where $USER is your login user name (the author's user name is root). This directory is not created automatically; it must be created with the mkdir command:
hadoop fs -mkdir /user/root
Use Hadoop's put command to copy the local file README.txt to HDFS:
hadoop fs -put README.txt .
Note that the last argument of this command is a period (.), which means the local file is placed in the default working directory. The command is therefore equivalent to:
hadoop fs -put README.txt /user/root
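The equivalence between a "." destination and the default working directory can be illustrated on an ordinary local filesystem (a plain-shell analogy, not Hadoop itself; the /tmp paths are invented for the demo):

```shell
# Analogy: "." resolves to the current working directory, just as
# "hadoop fs -put README.txt ." resolves to the HDFS home /user/$USER.
mkdir -p /tmp/hdfs_demo && cd /tmp/hdfs_demo
echo "Hadoop HDFS demo" > /tmp/README.txt
cp /tmp/README.txt .                 # relative destination: current directory
cmp /tmp/README.txt /tmp/hdfs_demo/README.txt && echo "identical"
```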
List the directory contents with Hadoop's ls command:
hadoop fs -ls
The display results are shown in Figure 1.
Figure 1 The ls command demo in Hadoop
2 Getting files
Getting a file can mean two things. One is HDFS getting a file from the local filesystem, which is the file addition described above; the other is the local filesystem getting a file from HDFS, for which you can use Hadoop's get command. For example, if README.txt does not exist locally and needs to be retrieved from HDFS, you can execute the following command:
hadoop fs -get README.txt .
or
hadoop fs -get /user/root/README.txt README.txt
3 Deleting files
Hadoop's delete command is rm. For example, to delete the README.txt uploaded earlier, you can execute the following command:
hadoop fs -rm README.txt
4 Retrieving files
Retrieving a file means viewing the contents of a file in HDFS, which you can do with Hadoop's cat command. For example, to view the contents of README.txt, you can execute the following command:
hadoop fs -cat README.txt
The partial display results are shown in Figure 2.
Figure 2 The cat command demo in Hadoop
In addition, the output of Hadoop's cat command can be piped to the UNIX head command:
hadoop fs -cat README.txt | head
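The pipe pattern itself works the same as with any local command; a runnable plain-shell illustration (no Hadoop required; /tmp/long.txt is a made-up demo file):

```shell
# head keeps only the first 10 lines by default, so a long listing is trimmed.
seq 1 20 > /tmp/long.txt        # a 20-line file
cat /tmp/long.txt | head        # prints lines 1 through 10, one per line
```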
Hadoop also supports the tail command for viewing the last kilobyte of a file. For example, to check the last kilobyte of README.txt, you can execute the following command:
hadoop fs -tail README.txt
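The byte-oriented (rather than line-oriented) behavior can be demonstrated locally with the ordinary tail command's -c option (a plain-shell analogy, not Hadoop; /tmp/bytes.txt is a made-up demo file):

```shell
# Build a 4096-byte file of "x" characters, then take the last kilobyte,
# mirroring what "hadoop fs -tail" shows for a file in HDFS.
head -c 4096 /dev/zero | tr '\0' 'x' > /tmp/bytes.txt
tail -c 1024 /tmp/bytes.txt | wc -c     # the byte count is 1024
```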
5 Checking help
Hadoop's built-in command help lets us master and use the commands effectively. Running hadoop fs with no arguments lists all available commands, and the help option displays the usage and a brief description of a specific command.
For example, to learn about the ls command, execute the following command:
hadoop fs -help ls
The description of the Hadoop command LS is shown in Figure 3.
Figure 3 Introduction to the Hadoop command ls