Hadoop Series, Part 5: The Hadoop Command Line Explained


1 Hadoop fs

--------------------------------------------------------------------------------

Relative paths in hadoop fs subcommands resolve against the user's HDFS home directory, which on this machine is /user/root.

--------------------------------------------------------------------------------

1. List all commands supported by the Hadoop shell:

$ bin/hadoop fs -help

2. Show the details of a particular command:

$ bin/hadoop fs -help command-name
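For example, to show the usage of the ls subcommand (ls is just an illustrative choice; any supported subcommand name works):

$ bin/hadoop fs -help ls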

3. View a summary history log for a job in the specified output directory:

$ bin/hadoop job -history output-dir

This command shows job details, as well as details of failed and killed jobs.

4. For more details about a job, such as its successful tasks and the attempts made for each task, use the following command:

$ bin/hadoop job -history all output-dir

5. Format a new distributed file system:

$ bin/hadoop namenode -format

6. On the designated NameNode, run the following command to start HDFS:

$ bin/start-dfs.sh

The bin/start-dfs.sh script consults the ${HADOOP_CONF_DIR}/slaves file on the NameNode and starts a DataNode daemon on every listed slave.
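As a minimal sketch, the slaves file is simply a list of hostnames, one per line (the hostnames below are illustrative):

slave1.example.com
slave2.example.com
slave3.example.com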

7. On the designated JobTracker, run the following command to start MapReduce:

$ bin/start-mapred.sh

The bin/start-mapred.sh script consults the ${HADOOP_CONF_DIR}/slaves file on the JobTracker and starts a TaskTracker daemon on every listed slave.

8. On the designated NameNode, run the following command to stop HDFS:

$ bin/stop-dfs.sh

The bin/stop-dfs.sh script consults the ${HADOOP_CONF_DIR}/slaves file on the NameNode and stops the DataNode daemon on every listed slave.

9. On the designated JobTracker, run the following command to stop MapReduce:

$ bin/stop-mapred.sh

The bin/stop-mapred.sh script consults the ${HADOOP_CONF_DIR}/slaves file on the JobTracker and stops the TaskTracker daemon on every listed slave.

2 DFS Shell

--------------------------------------------------------------------------------

10. hadoop dfs commands execute from the HDFS root directory, "/".

11. Create a directory named jay:

$ bin/hadoop dfs -mkdir /jay

12. View the contents of the file /foodir/myfile.txt:

$ bin/hadoop dfs -cat /foodir/myfile.txt
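For instance, a local file can first be uploaded with -put and then viewed with -cat (localfile.txt is an illustrative local path):

$ bin/hadoop dfs -put localfile.txt /foodir/myfile.txt
$ bin/hadoop dfs -cat /foodir/myfile.txt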

3 DFSAdmin

--------------------------------------------------------------------------------

13. Put the cluster into safe mode:

$ bin/hadoop dfsadmin -safemode enter

14. Show the list of DataNodes:

$ bin/hadoop dfsadmin -report

15. Decommission the DataNode named datanodename:

$ bin/hadoop dfsadmin -decommission datanodename

16. The bin/hadoop dfsadmin -help command lists all of the commands currently supported, for example:

* -report: reports basic HDFS statistics. Some of this information is also available on the NameNode web front page.

* -safemode: though not normally needed, an administrator can manually put the NameNode into safe mode or take it out of safe mode (a short example follows this list).

* -finalizeUpgrade: Removes the cluster backup made during the last upgrade.
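A typical check-and-leave sequence might look like the following sketch (the output line illustrates the usual reply format):

$ bin/hadoop dfsadmin -safemode get
Safe mode is ON
$ bin/hadoop dfsadmin -safemode leave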

17. Explicitly place HDFS in safe mode:

$ bin/hadoop dfsadmin -safemode enter

18. Before a new upgrade, administrators must finalize the previous upgrade with the following command, which removes the existing backup:

$ bin/hadoop dfsadmin -finalizeUpgrade

19. To find out whether a cluster upgrade still needs to be finalized:

$ bin/hadoop dfsadmin -upgradeProgress status

20. Start the new version with the -upgrade option:

$ bin/start-dfs.sh -upgrade

21. If you need to return to the old version, stop the cluster, deploy the old version of Hadoop, and start the cluster with the rollback option:

$ bin/start-dfs.sh -rollback
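Taken together, items 18-21 suggest an upgrade session along the following lines (a sketch only; it assumes the new Hadoop version has already been deployed on every node):

$ bin/stop-dfs.sh
$ bin/start-dfs.sh -upgrade
$ bin/hadoop dfsadmin -upgradeProgress status
$ bin/hadoop dfsadmin -finalizeUpgrade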

22. The following new commands and options support quotas. The first two are administrator commands. A short usage example follows the list.

* dfsadmin -setQuota <quota> <dirname>...<dirname>

Sets the quota of each directory to N. The command is tried on each directory; an error is reported if N is not a positive integer, if the directory does not exist or is a file, or if the directory would immediately exceed the new quota.

* dfsadmin -clrQuota <dirname>...<dirname>

Removes the quota of each directory. The command is tried on each directory; an error is reported if the directory does not exist or is a file. It is not an error if the directory has no quota set.

* fs -count -q <dirname>...<dirname>

With the -q option, the quota set on each directory and the remaining quota are also reported. If a directory has no quota set, none and inf are reported.
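For example, to limit a directory tree to at most 100 names and then inspect its quota (the path and the quota value are illustrative):

$ bin/hadoop dfsadmin -setQuota 100 /user/root/data
$ bin/hadoop fs -count -q /user/root/data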

23. Create a Hadoop archive:

$ hadoop archive -archiveName NAME <src>* <dest>

-archiveName NAME The name of the archive to create.

src File system path names, which accept regular expressions as usual.

dest The destination directory in which to save the archive.
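For example, the following sketch archives everything under /user/root/input into an archive named foo.har stored under /user/root/archives (all names are illustrative):

$ hadoop archive -archiveName foo.har /user/root/input /user/root/archives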

24. Recursively copy files or directories:

$ hadoop distcp <srcurl> <desturl>

srcurl Source URL.

desturl Destination URL.
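For example, to copy a directory between two clusters (the NameNode addresses and paths are illustrative):

$ hadoop distcp hdfs://nn1:8020/user/root/src hdfs://nn2:8020/user/root/dest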

4 Hadoop fsck

--------------------------------------------------------------------------------

25. Run the HDFS file system checking tool (fsck):

Usage: hadoop fsck [GENERIC_OPTIONS] <path> [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]

Command options:

<path> The path at which checking starts.

-move Move corrupted files to /lost+found.

-delete Delete corrupted files.

-openforwrite Print files that are open for writing.

-files Print the files being checked.

-blocks Print the block report.

-locations Print the location of every block.

-racks Print the network topology of the DataNode locations.
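For example, to check the entire file system and print block and location details for every file:

$ bin/hadoop fsck / -files -blocks -locations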

5 Interacting with MapReduce Jobs (job)

--------------------------------------------------------------------------------

26. Command for interacting with MapReduce jobs (job):

Usage: hadoop job [GENERIC_OPTIONS] [-submit <job-file>] | [-status <job-id>] | [-counter <job-id> <group-name> <counter-name>] | [-kill <job-id>] | [-events <job-id> <from-event-#> <#-of-events>] | [-history [all] <jobOutputDir>] | [-list [all]] | [-kill-task <task-id>] | [-fail-task <task-id>]

Command options:

-submit <job-file> Submit the job.

-status <job-id> Print the map and reduce completion percentages and all job counters.

-counter <job-id> <group-name> <counter-name> Print the value of the counter.

-kill <job-id> Kill the specified job.

-events <job-id> <from-event-#> <#-of-events> Print the details of the events received by the JobTracker for the given range.

-history [all] <jobOutputDir> Print job details, and details of failed and killed jobs. More details about a job, such as its successful tasks and the attempts made for each task, can be viewed by specifying the [all] option.

-list [all] -list all displays all jobs; -list displays only jobs that have yet to complete.

-kill-task <task-id> Kill the task. Killed tasks are not counted against failed attempts.

-fail-task <task-id> Fail the task. Failed tasks are counted against failed attempts.
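For example, to list the incomplete jobs and kill one of them (the job ID below merely illustrates the usual job_<timestamp>_<serial> format):

$ bin/hadoop job -list
$ bin/hadoop job -kill job_200901010000_0001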

6 Run the pipes job

--------------------------------------------------------------------------------

27. Run a pipes job:

Usage: hadoop pipes [-conf <path>] [-jobconf <key=value>, <key=value>, ...] [-input <path>] [-output <path>] [-jar <jar file>] [-inputformat <class>] [-map <class>] [-partitioner <class>] [-reduce <class>] [-writer <class>] [-program <executable>] [-reduces <num>]

Command options:

-conf <path> Configuration for the job.

-jobconf <key=value>, <key=value>, ... Add or override configuration for the job.

-input <path> Input directory.

-output <path> Output directory.

-jar <jar file> JAR file name.

-inputformat <class> InputFormat class.

-map <class> Java Map class.

-partitioner <class> Java Partitioner class.

-reduce <class> Java Reduce class.

-writer <class> Java RecordWriter class.

-program <executable> URI of the executable.

-reduces <num> Number of reduces.
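A minimal pipes invocation might look like the following sketch (the input, output, and program paths are illustrative; the executable must already reside in HDFS, and some programs need additional -jobconf settings):

$ bin/hadoop pipes -input /user/root/in -output /user/root/out -program /user/root/bin/wordcount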

7 Other Commands

--------------------------------------------------------------------------------

28. Print version information:

Usage: hadoop version

29. The hadoop script can be used to invoke any class:

Usage: hadoop CLASSNAME

Runs the class named CLASSNAME.
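For example, assuming the class org.apache.hadoop.util.VersionInfo is present in your build, invoking it this way prints the same information as hadoop version:

$ bin/hadoop org.apache.hadoop.util.VersionInfo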

30. Run the cluster balancing tool. The administrator can stop the rebalancing process (balancer) simply by pressing Ctrl-C:

Usage: hadoop balancer [-threshold <threshold>]

Command options:

-threshold <threshold> Percentage of disk capacity. This overrides the default threshold.
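For example, to balance until every DataNode's utilization is within 5 percentage points of the cluster average (5 is an illustrative value; the default threshold is 10):

$ bin/hadoop balancer -threshold 5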

31. Get or set the log level of each daemon (daemonlog).

Usage: hadoop daemonlog -getlevel <host:port> <name>

Usage: hadoop daemonlog -setlevel <host:port> <name> <level>

Command options:

-getlevel <host:port> <name> Print the log level of the daemon running at <host:port>. Internally, this command connects to http://<host:port>/logLevel?log=<name>.

-setlevel <host:port> <name> <level> Set the log level of the daemon running at <host:port>. Internally, this command connects to http://<host:port>/logLevel?log=<name>.
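For example, to query and then raise the log level of a logger on the NameNode (the host, port, and logger name are illustrative):

$ hadoop daemonlog -getlevel namenode.example.com:50070 org.apache.hadoop.ipc.Server
$ hadoop daemonlog -setlevel namenode.example.com:50070 org.apache.hadoop.ipc.Server DEBUG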

32. Run an HDFS DataNode:

Usage: hadoop datanode [-rollback]

Command options:

-rollback Roll the DataNode back to the previous version. This should be used after stopping the DataNode and distributing the old Hadoop version.

33. Run an HDFS dfsadmin client:

Usage: hadoop dfsadmin [GENERIC_OPTIONS] [-report] [-safemode enter | leave | get | wait] [-refreshNodes] [-finalizeUpgrade] [-upgradeProgress status | details | force] [-metasave filename] [-setQuota <quota> <dirname>...<dirname>] [-clrQuota <dirname>...<dirname>] [-help [cmd]]

Command options:

-report Report basic file system information and statistics.

-safemode enter | leave | get | wait Safe-mode maintenance command. Safe mode is a NameNode state in which the NameNode 1. does not accept changes to the namespace (read-only) and 2. does not replicate or delete blocks. The NameNode enters safe mode automatically at startup and leaves it automatically once the configured minimum percentage of blocks satisfies the minimum replication condition. Safe mode can also be entered manually, but then it must also be left manually.

-refreshNodes Re-read the hosts and exclude files to update the set of DataNodes that are allowed to connect to the NameNode and those that should be decommissioned or recommissioned.

-finalizeUpgrade Finalize the HDFS upgrade. DataNodes delete their previous-version working directories, after which the NameNode does the same. This completes the upgrade process.

-upgradeProgress status | details | force Request the current upgrade status or detailed status, or force the upgrade to proceed.

-metasave filename Save the NameNode's primary data structures to <filename> in the directory specified by the hadoop.log.dir property. The file contains one line for each of the following:

1. DataNodes from which the NameNode has received a heartbeat;

2. blocks waiting to be replicated;

3. blocks currently being replicated;

4. blocks waiting to be deleted.

-setQuota <quota> <dirname>...<dirname> Set the quota for each directory <dirname>. The directory quota is a long integer that places a hard limit on the number of names in the directory tree.

The command is applied to each directory on a best-effort basis; an error is reported if:

1. N is not a positive integer, or

2. the user is not an administrator, or

3. the directory does not exist or is a file, or

4. the directory would immediately exceed the new quota.

-clrQuota <dirname>...<dirname> Clear the quota for each directory <dirname>. The command is applied to each directory on a best-effort basis; an error is reported if:

1. the directory does not exist or is a file, or

2. the user is not an administrator.

It is not an error if the directory previously had no quota.

-help [cmd] Display help for the given command, or for all commands if none is specified.

34. Run the MapReduce JobTracker node:

Usage: hadoop jobtracker

35. Run the NameNode. For more information on upgrade, rollback, and finalize, refer to Upgrade and Rollback.

Usage: hadoop namenode [-format] | [-upgrade] | [-rollback] | [-finalize] | [-importCheckpoint]

Command options:

-format Format the NameNode. It starts the NameNode, formats it, and then shuts it down.

-upgrade After a new version of Hadoop has been distributed, the NameNode should be started with the -upgrade option.

-rollback Roll the NameNode back to the previous version. This option should be used after stopping the cluster and distributing the old Hadoop version.

-finalize Finalize deletes the previous state of the file system. The most recent upgrade becomes permanent and the rollback option is no longer available. After finalizing, it shuts down the NameNode.

-importCheckpoint Load an image from the checkpoint directory and save it into the current one. The checkpoint directory is read from the fs.checkpoint.dir property.

36. Run the HDFS SecondaryNameNode:

Usage: hadoop secondarynamenode [-checkpoint [force]] | [-geteditsize]

Command options:

-checkpoint [force] Checkpoint the SecondaryNameNode if the EditLog size is >= fs.checkpoint.size. If -force is used, checkpoint regardless of the EditLog size.

-geteditsize Print the EditLog size.

37. Run a MapReduce TaskTracker node:

Usage: hadoop tasktracker

8 Summary

--------------------------------------------------------------------------------

This article is for reference only; full details of any specific command can be obtained from its built-in help.
