Hadoop Command Manual

Overview

All Hadoop commands are invoked by the bin/hadoop script. Running the hadoop script without any arguments prints the description of all commands.

Usage: hadoop [--config confdir] COMMAND [GENERIC_OPTIONS] [COMMAND_OPTIONS]

Hadoop has an option parsing framework for parsing generic options as well as running classes.

--config confdir: Overwrites the default configuration directory. The default is ${HADOOP_HOME}/conf.
GENERIC_OPTIONS: The common set of options supported by multiple commands.
COMMAND, COMMAND_OPTIONS: The various commands with their options are described in the sections below. The commands are grouped into User Commands and Administration Commands.

Generic Options

The following options are supported by dfsadmin, fs, fsck, and job. Applications should implement Tool to support the generic options.

-conf <configuration file>: Specify an application configuration file.
-D <property=value>: Use the given value for the given property.
-fs <local|namenode:port>: Specify a namenode.
-jt <local|jobtracker:port>: Specify a job tracker. Applies only to job.
-files <comma separated list of files>: Specify comma-separated files to be copied to the map reduce cluster. Applies only to job.
-libjars <comma separated list of jars>: Specify comma-separated jar files to include in the classpath. Applies only to job.
-archives <comma separated list of archives>: Specify comma-separated archives to be unarchived on the compute machines. Applies only to job.
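
For illustration, the generic options come before the command-specific arguments; the hostnames, ports, and property values below are hypothetical:

hadoop fs -D fs.default.name=hdfs://namenode:9000 -ls /
hadoop job -jt jobtracker-host:9001 -list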

User Commands

Commands common to users of a Hadoop cluster.

archive

Creates a Hadoop archive. Refer to Hadoop Archives for more information.

Usage: hadoop archive -archiveName NAME <src>* <dest>

-archiveName NAME: The name of the archive to be created.
src: Filesystem pathnames, which work as usual with regular expressions.
dest: The destination directory that will contain the archive.
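
For example, to archive two hypothetical input directories into foo.har under /user/zoo:

hadoop archive -archiveName foo.har /user/hadoop/dir1 /user/hadoop/dir2 /user/zoo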

distcp

Copies files or directories recursively. Refer to the DistCp Guide for more information.

Usage: hadoop distcp <srcurl> <desturl>

srcurl: The source URL.
desturl: The destination URL.
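
For example, to copy a directory between two clusters (the namenode hosts and ports are hypothetical):

hadoop distcp hdfs://nn1:8020/foo/bar hdfs://nn2:8020/bar/foo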

fs

Usage: hadoop fs [GENERIC_OPTIONS] [COMMAND_OPTIONS]

Runs a generic filesystem user client.

The various command options are described in the HDFS Shell Guide.
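
For example, two common invocations with hypothetical paths:

hadoop fs -ls /user/hadoop
hadoop fs -cat /user/hadoop/file.txt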

fsck

Runs the HDFS filesystem checking utility. Refer to fsck for more information.

Usage: hadoop fsck [GENERIC_OPTIONS] <path> [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]

<path>: The path at which checking starts.
-move: Move corrupted files to /lost+found.
-delete: Delete corrupted files.
-openforwrite: Print files that are open for writing.
-files: Print the files being checked.
-blocks: Print the block report.
-locations: Print the location of every block.
-racks: Print the network topology of the data-node locations.
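
For example, to check a hypothetical home directory and print the blocks and locations of its files:

hadoop fsck /user/hadoop -files -blocks -locations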

jar

Runs a jar file. Users can bundle their map reduce code into a jar file and execute it with this command.

Usage: hadoop jar <jar> [mainClass] args...

Streaming jobs are run via this command. Refer to the examples in the Streaming documentation.

The word count example is also run using the jar command. Refer to the WordCount example.
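
For example, to run the word count example on hypothetical input and output paths (the examples jar name varies by Hadoop release, so treat it as a placeholder):

hadoop jar hadoop-examples.jar wordcount /user/hadoop/input /user/hadoop/output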

job

Command to interact with map reduce jobs.

Usage: hadoop job [GENERIC_OPTIONS] [-submit <job-file>] | [-status <job-id>] | [-counter <job-id> <group-name> <counter-name>] | [-kill <job-id>] | [-events <job-id> <from-event-#> <#-of-events>] | [-history [all] <jobOutputDir>] | [-list [all]] | [-kill-task <task-id>] | [-fail-task <task-id>]

-submit <job-file>: Submits the job.
-status <job-id>: Prints the map and reduce completion percentages and all job counters.
-counter <job-id> <group-name> <counter-name>: Prints the counter value.
-kill <job-id>: Kills the job.
-events <job-id> <from-event-#> <#-of-events>: Prints the events' details received by the jobtracker for the given range.
-history [all] <jobOutputDir>: Prints job details, plus failed and killed task details. More details about the job, such as successful tasks and the task attempts made for each task, can be viewed by specifying the [all] option.
-list [all]: With [all], displays all jobs; otherwise displays only jobs which are yet to complete.
-kill-task <task-id>: Kills the task. Killed tasks are not counted against failed attempts.
-fail-task <task-id>: Fails the task. Failed tasks are counted against failed attempts.
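
For example, to query and then kill a job (the job id below is hypothetical):

hadoop job -status job_200707121733_0003
hadoop job -kill job_200707121733_0003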

pipes

Runs a pipes job.

Usage: hadoop pipes [-conf <path>] [-jobconf <key=value>, <key=value>, ...] [-input <path>] [-output <path>] [-jar <jar file>] [-inputformat <class>] [-map <class>] [-partitioner <class>] [-reduce <class>] [-writer <class>] [-program <executable>] [-reduces <num>]

-conf <path>: Configuration for the job.
-jobconf <key=value>, <key=value>, ...: Add or override configuration for the job.
-input <path>: Input directory.
-output <path>: Output directory.
-jar <jar file>: Jar filename.
-inputformat <class>: InputFormat class.
-map <class>: Java Map class.
-partitioner <class>: Java Partitioner class.
-reduce <class>: Java Reduce class.
-writer <class>: Java RecordWriter class.
-program <executable>: URI of the executable.
-reduces <num>: Number of reduces.
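
For example, a pipes invocation along the lines of the C++ word count example, with hypothetical paths; the two -D properties select the Java record reader and writer:

hadoop pipes -D hadoop.pipes.java.recordreader=true -D hadoop.pipes.java.recordwriter=true -input /user/hadoop/in -output /user/hadoop/out -program /user/hadoop/bin/wordcount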

version

Prints the version.

Usage: hadoop version

CLASSNAME

The hadoop script can be used to invoke any class.

Usage: hadoop CLASSNAME

Runs the class named CLASSNAME.
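
For example, to invoke a class that ships with Hadoop and prints build information:

hadoop org.apache.hadoop.util.VersionInfo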

Administration Commands

Commands common to administrators of a Hadoop cluster.

balancer

Runs the cluster balancing utility. An administrator can simply press Ctrl-C to stop the rebalancing process. Refer to Rebalancer for more information.

Usage: hadoop balancer [-threshold <threshold>]

-threshold <threshold>: Percentage of disk capacity. This overwrites the default threshold.
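
For example, to run the balancer with a 5% threshold instead of the default:

hadoop balancer -threshold 5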

daemonlog

Gets or sets the log level for each daemon.

Usage: hadoop daemonlog -getlevel <host:port> <name>
Usage: hadoop daemonlog -setlevel <host:port> <name> <level>

-getlevel <host:port> <name>: Prints the log level of the daemon running at <host:port>. This command internally connects to http://<host:port>/logLevel?log=<name>.
-setlevel <host:port> <name> <level>: Sets the log level of the daemon running at <host:port>. This command internally connects to http://<host:port>/logLevel?log=<name>.
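
For example, to raise a daemon's log level to DEBUG (the host, port, and logger name are hypothetical and depend on the deployment and Hadoop version):

hadoop daemonlog -setlevel namenode-host:50070 org.apache.hadoop.hdfs.server.namenode.NameNode DEBUG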

datanode

Runs an HDFS datanode.

Usage: hadoop datanode [-rollback]

-rollback: Rolls back the datanode to the previous version. This should be used after stopping the datanode and distributing the old Hadoop version.

dfsadmin

Runs an HDFS dfsadmin client.

Usage: hadoop dfsadmin [GENERIC_OPTIONS] [-report] [-safemode enter | leave | get | wait] [-refreshNodes] [-finalizeUpgrade] [-upgradeProgress status | details | force] [-metasave filename] [-setQuota <quota> <dirname>...<dirname>] [-clrQuota <dirname>...<dirname>] [-help [cmd]]

-report: Reports basic filesystem information and statistics.
-safemode enter | leave | get | wait: Safe mode maintenance command. Safe mode is a NameNode state in which the NameNode:

1. does not accept changes to the namespace (read-only), and

2. does not replicate or delete blocks.

The NameNode enters safe mode automatically at startup and leaves it automatically when the configured minimum percentage of blocks satisfies the minimum replication condition. Safe mode can also be entered manually, but then it must be turned off manually as well.
-refreshNodes: Re-reads the hosts and exclude files to update the set of datanodes that are allowed to connect to the NameNode and those that should be decommissioned or recommissioned.
-finalizeUpgrade: Finalizes an HDFS upgrade. Datanodes delete their previous-version working directories, and then the NameNode does the same. This completes the upgrade process.
-upgradeProgress status | details | force: Requests the current upgrade status, detailed status, or forces the upgrade to proceed.
-metasave filename: Saves the NameNode's primary data structures to <filename> in the directory specified by the hadoop.log.dir property. <filename> contains one line for each of the following:


1. datanodes the NameNode has received a heartbeat from

2. blocks waiting to be replicated

3. blocks currently being replicated

4. blocks waiting to be deleted

-setQuota <quota> <dirname>...<dirname>: Sets the quota <quota> for each directory <dirname>. The directory quota is a long integer that puts a hard limit on the number of names under the directory tree.

The command works on each directory on a best-effort basis; an error is reported if:

1. N is not a positive integer, or

2. the user is not an administrator, or

3. the directory does not exist or is a file, or

4. the directory would immediately exceed the new quota.

-clrQuota <dirname>...<dirname>: Clears the quota for each directory <dirname>.

The command works on each directory on a best-effort basis; an error is reported if:

1. the directory does not exist or is a file, or

2. the user is not an administrator.

It is not an error if the directory has no quota.

-help [cmd]: Displays help for the given command, or for all commands if none is specified.
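
For example, to print a cluster report and set a name quota on a hypothetical directory:

hadoop dfsadmin -report
hadoop dfsadmin -setQuota 1000 /user/hadoop/project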

jobtracker

Runs the MapReduce job tracker node.

Usage: hadoop jobtracker

namenode

Runs the namenode. For more information about upgrade, rollback, and finalize, refer to Upgrade and Rollback.

Usage: hadoop namenode [-format] | [-upgrade] | [-rollback] | [-finalize] | [-importCheckpoint]

-format: Formats the namenode. It starts the namenode, formats it, and then shuts it down.
-upgrade: The namenode should be started with the upgrade option after a new version of Hadoop has been distributed.
-rollback: Rolls the namenode back to the previous version. This should be used after stopping the cluster and distributing the old Hadoop version.
-finalize: Removes the previous state of the filesystem. The most recent upgrade becomes permanent and the rollback option is no longer available. After finalizing, it shuts the namenode down.
-importCheckpoint: Loads the image from a checkpoint directory and saves it into the current one. The checkpoint directory is read from the fs.checkpoint.dir property.
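
For example, to format a new filesystem before starting the cluster for the first time (this is destructive and assumes the configured name directories are fresh):

hadoop namenode -format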

secondarynamenode

Runs the HDFS secondary namenode. Refer to Secondary NameNode for more information.

Usage: hadoop secondarynamenode [-checkpoint [force]] | [-geteditsize]

-checkpoint [force]: Checkpoints the secondary namenode if the EditLog size is >= fs.checkpoint.size. If force is used, the checkpoint is performed regardless of the EditLog size.
-geteditsize: Prints the EditLog size.
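
For example, to force a checkpoint regardless of the current EditLog size:

hadoop secondarynamenode -checkpoint force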

tasktracker

Runs the MapReduce task tracker node.

Usage: hadoop tasktracker
