Hadoop Command Overview

Source: Internet
Author: User
Tags: map class, hadoop fs

1. List all commands supported by the Hadoop shell:
$ bin/hadoop fs -help
2. Display detailed help for a specific command:
$ bin/hadoop fs -help command-name
3. View a summary of the history logs in the specified path:
$ bin/hadoop job -history output-dir
This command displays job details, plus details of failed and killed tasks.
4. For more details about the job, such as successful tasks and the attempts made for each task, run:
$ bin/hadoop job -history all output-dir
5. Format a new distributed file system:
$ bin/hadoop namenode -format
6. On the designated namenode, run the following command to start HDFS:
$ bin/start-dfs.sh
The bin/start-dfs.sh script consults the ${HADOOP_CONF_DIR}/slaves file on the namenode and starts the datanode daemon on every listed slave.
7. On the designated jobtracker, run the following command to start Map/Reduce:
$ bin/start-mapred.sh
The bin/start-mapred.sh script consults the ${HADOOP_CONF_DIR}/slaves file on the jobtracker and starts the tasktracker daemon on every listed slave.
8. On the designated namenode, run the following command to stop HDFS:
$ bin/stop-dfs.sh
The bin/stop-dfs.sh script consults the ${HADOOP_CONF_DIR}/slaves file on the namenode and stops the datanode daemon on every listed slave.
9. On the designated jobtracker, run the following command to stop Map/Reduce:
$ bin/stop-mapred.sh
The bin/stop-mapred.sh script consults the ${HADOOP_CONF_DIR}/slaves file on the jobtracker and stops the tasktracker daemon on every listed slave.
 
DFSShell
10. Create a directory named /foodir:
$ bin/hadoop dfs -mkdir /foodir
11. Remove a directory named /foodir:
$ bin/hadoop dfs -rmr /foodir
12. View the contents of a file named /foodir/myfile.txt:
$ bin/hadoop dfs -cat /foodir/myfile.txt

DFSAdmin
13. Place the cluster in safe mode:
$ bin/hadoop dfsadmin -safemode enter
14. Display the datanode list:
$ bin/hadoop dfsadmin -report
15. Decommission a datanode named datanodename:
$ bin/hadoop dfsadmin -decommission datanodename
16. The bin/hadoop dfsadmin -help command lists all currently supported commands. For example:
* -report: reports basic HDFS statistics. Some of this information is also available on the namenode web front page.
* -safemode: though usually not required, an administrator can manually enter or leave safe mode.
* -finalizeUpgrade: removes the cluster backup made during the previous upgrade.
17. Explicitly place HDFS in safe mode:
$ bin/hadoop dfsadmin -safemode enter
18. Before an upgrade, the administrator needs to remove the existing backup files with the finalize-upgrade command:
$ bin/hadoop dfsadmin -finalizeUpgrade
19. To find out whether a cluster still needs to have its upgrade finalized, run:
$ bin/hadoop dfsadmin -upgradeProgress status
20. Run the new version with the -upgrade option:
$ bin/start-dfs.sh -upgrade
21. If you need to go back to the old version, stop the cluster, deploy the old Hadoop version, and then start the cluster with the rollback option:
$ bin/start-dfs.sh -rollback
22. The following new commands or options support quotas. The first two are administrator commands.
* dfsadmin -setQuota <N> <directory>...<directory>
Sets the quota of each directory to N. The command is applied to each directory in turn; an error is reported if N is not a positive long integer, if the directory does not exist or is a file, or if the directory would immediately exceed the new quota.
* dfsadmin -clrQuota <directory>...<directory>
Removes the quota of each directory. The command is applied to each directory in turn; an error is reported if the directory does not exist or is a file. No error is returned if the directory has no quota set.
* fs -count -q <directory>...<directory>
With the -q option, the quota set for each directory and the remaining quota are reported. If a directory has no quota, none and inf are reported.
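For example, to limit a directory to 10000 names and then check the quota usage (the path /user/alice is a placeholder):
$ bin/hadoop dfsadmin -setQuota 10000 /user/alice
$ bin/hadoop fs -count -q /user/alice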
23. Create a Hadoop archive file:
$ hadoop archive -archiveName NAME <src>* <dest>
-archiveName NAME: the name of the archive to be created.
src: file system path names of the source, which may use regular-expression-style patterns.
dest: the target directory in which the archive is saved.
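For example, to pack a directory /user/alice/logs into an archive named logs.har stored under /user/alice (both paths are placeholders):
$ hadoop archive -archiveName logs.har /user/alice/logs /user/alice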
24. Recursively copy files or directories:
$ hadoop distcp <srcurl> <desturl>
srcurl: the source URL.
desturl: the destination URL.
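For example, copying a directory between two clusters (the namenode hosts nn1 and nn2 and the paths are placeholders):
$ hadoop distcp hdfs://nn1:8020/user/alice/input hdfs://nn2:8020/user/alice/input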

25. Run the HDFS file system checking utility (fsck):
Usage: hadoop fsck [GENERIC_OPTIONS] <path> [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]
Command option description:
<path>: the directory where the check starts.
-move: moves corrupted files to /lost+found.
-delete: deletes corrupted files.
-openforwrite: prints files that are open for write.
-files: prints the files being checked.
-blocks: prints the block report.
-locations: prints the location of every block.
-racks: prints the network topology of the datanode locations.
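For example, to check everything under /user/alice (a placeholder path) and print file, block, and location details:
$ hadoop fsck /user/alice -files -blocks -locations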
26. Command used to interact with MapReduce jobs (job):
Usage: hadoop job [GENERIC_OPTIONS] [-submit <job-file>] | [-status <job-id>] | [-counter <job-id> <group-name> <counter-name>] | [-kill <job-id>] | [-events <job-id> <from-event-#> <#-of-events>] | [-history [all] <jobOutputDir>] | [-list [all]] | [-kill-task <task-id>] | [-fail-task <task-id>]
Command option description:
-submit <job-file>: submits the job.
-status <job-id>: prints the map and reduce completion percentages and all job counters.
-counter <job-id> <group-name> <counter-name>: prints the counter value.
-kill <job-id>: kills the specified job.
-events <job-id> <from-event-#> <#-of-events>: prints the details of events received by the jobtracker within the given range.
-history [all] <jobOutputDir>: prints job details, plus details and reasons for failed and killed tasks. More details about the job, such as successful tasks and the attempts made for each task, can be viewed by specifying the [all] option.
-list [all]: -list all displays all jobs; -list displays only jobs that have not yet completed.
-kill-task <task-id>: kills the task. Killed tasks are not counted against failed attempts.
-fail-task <task-id>: fails the task. Failed tasks are counted against failed attempts.
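For example, checking the progress of a job and then killing it (the job ID is a placeholder):
$ bin/hadoop job -status job_200901011200_0001
$ bin/hadoop job -kill job_200901011200_0001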
27. Run a pipes job:
Usage: hadoop pipes [-conf <path>] [-jobconf <key=value>, <key=value>, ...] [-input <path>] [-output <path>] [-jar <jar file>] [-inputformat <class>] [-map <class>] [-partitioner <class>] [-reduce <class>] [-writer <class>] [-program <executable>] [-reduces <num>]
Command option description:
-conf <path>: job configuration.
-jobconf <key=value>, <key=value>, ...: adds/overrides job configuration items.
-input <path>: input directory.
-output <path>: output directory.
-jar <jar file>: jar file name.
-inputformat <class>: InputFormat class.
-map <class>: Java Map class.
-partitioner <class>: Java Partitioner class.
-reduce <class>: Java Reduce class.
-writer <class>: Java RecordWriter class.
-program <executable>: URI of the executable program.
-reduces <num>: number of reduces.
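For example, running a pipes job whose C++ executable has already been uploaded to HDFS (the configuration file, directories, and program path are placeholders):
$ bin/hadoop pipes -conf conf/word.xml -input in-dir -output out-dir -program bin/wordcount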
28. Print the version information.
Usage: hadoop version
29. The hadoop script can be used to invoke any class.
Usage: hadoop CLASSNAME
Runs the class named CLASSNAME.
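For example, the following invokes a utility class shipped with Hadoop that prints version and build information (assuming, as is normally the case, that the class is on the Hadoop classpath):
$ bin/hadoop org.apache.hadoop.util.VersionInfo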
30. Run the cluster balancing utility. The administrator can simply press Ctrl-C to stop the rebalancing process.
Usage: hadoop balancer [-threshold <threshold>]
Command option description:
-threshold <threshold>: percentage of disk capacity. This overrides the default threshold.
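For example, to rebalance until each datanode's utilization is within 5 percent of the cluster average:
$ bin/hadoop balancer -threshold 5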
31. Get or set the log level of each daemon (daemonlog).
Usage: hadoop daemonlog -getlevel <host:port> <name>
Usage: hadoop daemonlog -setlevel <host:port> <name> <level>
Command option description:
-getlevel <host:port> <name>: prints the log level of the daemon running at <host:port>. This command internally connects to http://<host:port>/logLevel?log=<name>.
-setlevel <host:port> <name> <level>: sets the log level of the daemon running at <host:port>. This command internally connects to http://<host:port>/logLevel?log=<name>.
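For example, raising a namenode logger to DEBUG (the host, web UI port, and logger name are placeholders; adjust them to your cluster and Hadoop version):
$ bin/hadoop daemonlog -setlevel namenode-host:50070 org.apache.hadoop.hdfs.server.namenode.FSNamesystem DEBUG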
32. Run an HDFS datanode.
Usage: hadoop datanode [-rollback]
Command option description:
-rollback: rolls the datanode back to the previous version. This should be used after stopping the datanode and distributing the old Hadoop version.
33. Run an HDFS dfsadmin client.
Usage: hadoop dfsadmin [GENERIC_OPTIONS] [-report] [-safemode enter | leave | get | wait] [-refreshNodes] [-finalizeUpgrade] [-upgradeProgress status | details | force] [-metasave filename] [-setQuota <quota> <dirname>...<dirname>] [-clrQuota <dirname>...<dirname>] [-help [cmd]]
Command option description:
-report: reports basic file system information and statistics.
-safemode enter | leave | get | wait: safe mode maintenance command. Safe mode is a namenode state in which the namenode
1. does not accept changes to the namespace (read-only), and
2. does not replicate or delete blocks.
The namenode enters safe mode automatically at startup and leaves it automatically once the configured minimum percentage of blocks satisfies the minimum replication condition. Safe mode can also be entered manually, but then it must also be left manually.
-refreshNodes: re-reads the hosts and exclude files, updating the set of datanodes that are allowed to connect to the namenode and those that should be decommissioned or recommissioned.
-finalizeUpgrade: finalizes an HDFS upgrade. Datanodes delete their working directories from the previous version, and then the namenode does the same. This completes the upgrade process.
-upgradeProgress status | details | force: requests the current upgrade status or its details, or forces the upgrade to proceed.
-metasave filename: saves the namenode's primary data structures to <filename> in the directory specified by hadoop.log.dir. <filename> contains one line for each of the following:
1. datanodes heart-beating with the namenode,
2. blocks waiting to be replicated,
3. blocks currently being replicated, and
4. blocks waiting to be deleted.
-setQuota <quota> <dirname>...<dirname>: sets the quota of each directory <dirname> to <quota>. The directory quota is a long integer that puts a hard limit on the number of names under the directory tree.
The command works on each directory in turn, and an error is reported if
1. the quota is not a positive integer, or
2. the user is not an administrator, or
3. the directory does not exist or is a file, or
4. the directory would immediately exceed the new quota.
-clrQuota <dirname>...<dirname>: clears the quota of each directory.
The command works on each directory in turn, and an error is reported if
1. the directory does not exist or is a file, or
2. the user is not an administrator.
No error is reported if the directory has no quota.
-help [cmd]: displays help for the given command, or for all commands if none is specified.
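For example, to dump the namenode's internal state to a file (the name meta.txt is a placeholder; the file is written under the directory given by hadoop.log.dir on the namenode):
$ bin/hadoop dfsadmin -metasave meta.txt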
34. Run the MapReduce job tracker node (jobtracker).
Usage: hadoop jobtracker
35. Run the namenode. For more information about upgrade, rollback, and finalizing an upgrade, see the upgrade and rollback documentation.
Usage: hadoop namenode [-format] | [-upgrade] | [-rollback] | [-finalize] | [-importCheckpoint]
Command option description:
-format: formats the namenode. It starts the namenode, formats it, and then shuts it down.
-upgrade: the namenode should be started with the -upgrade option after a new Hadoop version has been distributed.
-rollback: rolls the namenode back to the previous version. This should be used after stopping the cluster and distributing the old Hadoop version.
-finalize: deletes the previous state of the file system. The most recent upgrade becomes permanent and the rollback option is no longer available. The namenode is shut down when finalization completes.
-importCheckpoint: loads the image from the checkpoint directory and saves it to the current one. The checkpoint directory is read from fs.checkpoint.dir.
36. Run the HDFS secondary namenode.
Usage: hadoop secondarynamenode [-checkpoint [force]] | [-geteditsize]
Command option description:
-checkpoint [force]: checkpoints the secondary namenode if the EditLog size is greater than or equal to fs.checkpoint.size. With -force, a checkpoint is taken regardless of the EditLog size.
-geteditsize: prints the EditLog size.
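For example, to print the current EditLog size and then force an immediate checkpoint (both commands are typically run on the host designated as the secondary namenode):
$ bin/hadoop secondarynamenode -geteditsize
$ bin/hadoop secondarynamenode -checkpoint force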
37. Run the MapReduce task tracker node (tasktracker).
Usage: hadoop tasktracker