Hadoop HDFS File Operations


Summary: Hadoop HDFS file operations are usually performed in one of two ways: command-line mode and Java API mode. This article describes how to work with HDFS files in both ways.

Keywords: HDFS, file, command line, Java API

HDFS is a distributed file system designed for the distributed processing of massive data within the MapReduce framework.

HDFS file operations in Hadoop are usually done in one of two ways. One is the command line: Hadoop provides a set of command-line tools similar to the Linux file commands. The other is the Java API: Hadoop's Java library can be used to manipulate HDFS files programmatically.
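As a minimal sketch of the programmatic route, the same file operations can also be driven from a program by shelling out to the command-line tools. The helper names below are hypothetical, and the sketch assumes the hadoop launcher script is on PATH; it wraps the CLI rather than using Hadoop's Java API.

```python
import subprocess

def hadoop_fs(*args):
    # Build the argument vector for an "hadoop fs" invocation,
    # e.g. hadoop_fs("-mkdir", "/user/root").
    return ["hadoop", "fs", *args]

def run_fs(*args):
    # Execute the command and return its stdout as text.
    # Raises CalledProcessError on a non-zero exit status.
    return subprocess.run(hadoop_fs(*args), check=True,
                          capture_output=True, text=True).stdout
```

For example, run_fs("-ls") would return the same listing that typing the command in a shell prints.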

Mode one: Command-line mode

Hadoop file operation commands take the form

hadoop fs -cmd <args>

Here cmd is a specific file operation command, and <args> is a set of variable arguments.

Hadoop's most commonly used file manipulation commands include adding files and directories, getting files, deleting files, and so on.

1 Adding Files and directories

HDFS has a default working directory, /user/$USER, where $USER is your login username (the author's username is root). This directory is not created automatically; it must be created with the mkdir command.

hadoop fs -mkdir /user/root

Use Hadoop's put command to send the local file README.txt to HDFS.

hadoop fs -put README.txt .

Note that the last argument of this command is a period (.), which means the local file is placed in the default working directory. The command is therefore equivalent to:

hadoop fs -put README.txt /user/root
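The equivalence above follows from how relative HDFS paths are resolved against the default working directory. A rough sketch of that rule (an illustration of the semantics, not Hadoop's actual code):

```python
def resolve_hdfs_path(path, user):
    # Resolve a path the way "hadoop fs" treats it: paths that do not
    # start with "/" are taken relative to /user/<username>.
    home = f"/user/{user}"
    if path == ".":
        return home
    if not path.startswith("/"):
        return f"{home}/{path}"
    return path
```

Under this rule, "." for user root resolves to /user/root, which is why the two put commands do the same thing.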

To list the files in the working directory, use Hadoop's ls command:

hadoop fs -ls

The display results are shown in Figure 1.

Figure 1 The ls command demo in Hadoop

2 Getting files

Getting a file has two senses here. One is HDFS getting a file from the local file system, which is the file addition described above. The other is the local file system getting a file from HDFS, for which Hadoop's get command can be used. For example, if the local directory does not contain README.txt and it needs to be retrieved from HDFS, execute the following command.

hadoop fs -get README.txt .

Or

hadoop fs -get README.txt /usr/root/README.txt

3 Deleting files

Hadoop's delete command is rm. For example, to delete the README.txt uploaded earlier, execute the following command.

hadoop fs -rm README.txt

4 Retrieving files

Retrieving a file means looking at the contents of a file in HDFS, which can be done with Hadoop's cat command. For example, to view the contents of README.txt, execute the following command.

hadoop fs -cat README.txt

Partial display results are shown in Figure 2.

Figure 2 The cat command demo in Hadoop

In addition, the output of Hadoop's cat command can be piped to the UNIX head command:

hadoop fs -cat README.txt | head

Hadoop also supports a tail command for viewing the last kilobyte of a file. For example, to check the last kilobyte of README.txt, execute the following command.

hadoop fs -tail README.txt
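The semantics of tail can be illustrated in a few lines of Python. This is a sketch of the behavior only; the real command reads the trailing block of the file over HDFS rather than loading it into memory.

```python
def tail_kilobyte(data: bytes) -> bytes:
    # Return the last 1 KB (1024 bytes) of a byte sequence,
    # mirroring what "hadoop fs -tail" prints for a file.
    # Files shorter than 1 KB are returned whole.
    return data[-1024:]
```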

5 Checking help

Checking the help for Hadoop commands helps us master and use them well. Running hadoop fs with no arguments lists the full set of file commands for your version of Hadoop, and the help option displays the usage and a brief description of a specific command.

For example, to learn about the ls command, execute the following command.

hadoop fs -help ls

The description of the Hadoop command LS is shown in Figure 3.

Figure 3 Introduction to the Hadoop command ls

