Common operations for HDFS


Original article: http://www.cnblogs.com/archimedes/p/hdfs-operations.html. Please indicate the source when reprinting.

1. File operations under HDFS

1. Listing HDFS files

List files under HDFS with the "-ls" command:

~/opt/hadoop-0.20.2$ bin/hadoop dfs -ls

Note: running "-ls" without arguments does not mean there is nothing to list; by default it returns the contents of your "home" directory in HDFS. HDFS has no notion of a current working directory and no cd command.
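
As a concrete illustration (the username hadoop here is only an assumed example), under the default configuration a user's HDFS home directory is /user/<username>, so the parameterless command above should list the same contents as:

$ bin/hadoop dfs -ls /user/hadoop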

2. Listing files in an HDFS directory

Here the "-ls directory-name" command lists the files in the HDFS directory named in:

~/opt/hadoop-0.20.2$ bin/hadoop dfs -ls in
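
If the in directory contains subdirectories, a recursive listing can be handier; in this version of Hadoop that is provided by the -lsr command:

$ bin/hadoop dfs -lsr in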

3. Uploading a file to HDFS

Here the "-put file1 file2" command uploads the test1 file from the hadoop-0.20.2 directory to HDFS and renames it test:

~/opt/hadoop-0.20.2$ bin/hadoop dfs -put test1 test

Note: "-put" has only two possible outcomes: success or failure. When uploading, the file is first copied to the DataNodes, and the upload succeeds only after all the DataNodes involved have received the data.
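
A simple way to confirm the result is to list the destination right after uploading, for example:

$ bin/hadoop dfs -put test1 test
$ bin/hadoop dfs -ls test

The closely related -copyFromLocal command behaves the same way when the source is a local file.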

4. Copying files from HDFS to the local system

Here the "-get file1 file2" command copies the in file from HDFS to the local system and names it getin:

~/opt/hadoop-0.20.2$ bin/hadoop dfs -get in getin
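
Since in is a directory (its contents are printed with in/* below), the related -getmerge command is also worth knowing: it concatenates the files under an HDFS directory into a single local file. The local name merged.txt is just an example:

$ bin/hadoop dfs -getmerge in merged.txt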

5. Deleting a directory in HDFS

Here the "-rmr file" command recursively deletes the directory named out in HDFS:

~/opt/hadoop-0.20.2$ bin/hadoop dfs -rmr out

After the command executes, listing the directory again shows that only in remains, confirming that the deletion succeeded.
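
Note that -rmr removes directories recursively; to delete a single file, the non-recursive -rm is the safer choice, for example removing the test file uploaded earlier:

$ bin/hadoop dfs -rm test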

6. Viewing a file in HDFS

Here the "-cat file" command displays the contents of the files in the HDFS in directory:

~/opt/hadoop-0.20.2$ bin/hadoop dfs -cat in/*

Output:

Hello World
Hello Hadoop
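
For a large file, printing everything with -cat is impractical; -tail outputs only the last kilobyte of a file, which is usually enough for a quick check, e.g. on the test file uploaded earlier:

$ bin/hadoop dfs -tail test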

PS: There are many more bin/hadoop dfs commands than those shown here; the commands above cover the most common operations, and the others can be learned with "-help commandName".
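
For example, to see the usage of the -put command:

$ bin/hadoop dfs -help put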

2. Management and updates

1. Reporting basic HDFS statistics

View basic statistics about HDFS with the "-report" command:

~/opt/hadoop-0.20.2$ bin/hadoop dfsadmin -report

The execution results are as follows:

14/12/02 05:19:05 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
Configured Capacity: 19945680896 (18.58 GB)
Present Capacity: 13558165504 (12.63 GB)
DFS Remaining: 13558099968 (12.63 GB)
DFS Used: 65536 (64 KB)
DFS Used%: 0%
Under replicated blocks: 1
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Name: 127.0.0.1:50010
Decommission Status : Normal
Configured Capacity: 19945680896 (18.58 GB)
DFS Used: 65536 (64 KB)
Non DFS Used: 6387515392 (5.95 GB)
DFS Remaining: 13558099968 (12.63 GB)
DFS Used%: 0%
DFS Remaining%: 67.98%
Last contact: Tue Dec 02 05:19:04 PST 2014
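
A useful companion to -report is the fsck tool, which checks file system health and reports under-replicated, corrupt and missing blocks (such as the one under-replicated block shown above):

$ bin/hadoop fsck /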

2. Exiting safe mode

The NameNode automatically enters safe mode when it starts up. Safe mode is a state of the NameNode in which no modifications to the file system are allowed. Its purpose is to check the validity of the data blocks on each DataNode at system startup, copying or deleting blocks according to policy; when the minimum percentage of blocks satisfies the configured minimum replication level, safe mode is exited automatically. It can also be left manually:

~/opt/hadoop-0.20.2$ bin/hadoop dfsadmin -safemode leave

3. Entering safe mode

~/opt/hadoop-0.20.2$ bin/hadoop dfsadmin -safemode enter
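
Besides leave and enter, -safemode also accepts get, which reports whether safe mode is currently on, and wait, which blocks until safe mode is exited; the latter is handy in scripts that must not modify the file system too early:

$ bin/hadoop dfsadmin -safemode get
$ bin/hadoop dfsadmin -safemode wait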

4. Adding nodes

Scalability is an important feature of HDFS, and adding a node to an HDFS cluster is easy to do. To add a new DataNode, first install Hadoop on the new node, using the same configuration as on the NameNode, and modify the $HADOOP_HOME/conf/masters file to add the NameNode's hostname. Then modify the $HADOOP_HOME/conf/slaves file on the NameNode to add the new node's hostname, establish a password-free SSH connection to the new node, and run the start command:

$ bin/start-all.sh

Visit http://(hostname):50070 to confirm that the new DataNode has been added successfully.
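
Restarting the whole cluster with start-all.sh is not strictly required; assuming the configuration files on the new node are already in place, its daemons can be started individually on that node, and the DataNode will register itself with the NameNode:

$ bin/hadoop-daemon.sh start datanode
$ bin/hadoop-daemon.sh start tasktracker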

5. Load balancing

Users can rebalance the distribution of data blocks across the DataNodes with the following command:

$ bin/start-balancer.sh
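
By default the balancer stops once every DataNode's utilization is within 10% of the cluster average; this threshold can be adjusted, for example to 5%, with:

$ bin/start-balancer.sh -threshold 5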

Resources

"Hadoop in Action: Opening the Way to Cloud Computing", Liu Peng
