Hadoop Basic Operations Command Encyclopedia

Source: Internet

Author: User

Keywords: name, dfs, running, quotas

Start Hadoop

start-all.sh

Stop Hadoop

stop-all.sh

View File List

View the files in the /user/admin/aaron directory in HDFS:

hadoop fs -ls /user/admin/aaron

List all files (including files under subdirectories) in the /user/admin/aaron directory in HDFS:

hadoop fs -lsr /user/admin/aaron
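A -lsr listing can be post-processed with ordinary shell tools. A minimal sketch, using a hypothetical captured listing (the paths and sizes below are made up for illustration); the permission column starts with "-" for plain files and "d" for directories:

```shell
# Hypothetical `hadoop fs -lsr /user/admin/aaron` output, captured for illustration:
lsr='drwxr-xr-x - admin supergroup       0 2010-05-31 09:37 /user/admin/aaron/newdir
-rw-r--r-- 3 admin supergroup 1048576 2010-05-31 09:40 /user/admin/aaron/newfile'

# Count plain files: their permission string starts with "-", directories with "d".
nfiles=$(printf '%s\n' "$lsr" | grep -c '^-')
echo "$nfiles"
```

On a live cluster you would pipe the real command instead: `hadoop fs -lsr /user/admin/aaron | grep -c '^-'`.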

Create a directory

hadoop fs -mkdir /user/admin/aaron/newdir

Deleting files

Delete the file named needDelete in the /user/admin/aaron directory in HDFS:

hadoop fs -rm /user/admin/aaron/needDelete

Delete the /user/admin/aaron directory in HDFS and all files in it:

hadoop fs -rmr /user/admin/aaron

Uploading files

hadoop fs -put /home/admin/newfile /user/admin/aaron/

Download files

hadoop fs -get /user/admin/aaron/newfile /home/admin/newfile
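For many files, -put can be driven from a shell loop. A sketch under stated assumptions: the directory /home/admin/logs and the file names are hypothetical, and the commands are only echoed here so the loop can be inspected before running it against a live cluster:

```shell
# Hypothetical file names; on a real machine you would use: for f in /home/admin/logs/*.log
files="a.log b.log"
dst=/user/admin/aaron

# Build (but do not run) one upload command per file; drop "echo" to execute them.
cmds=$(for f in $files; do echo hadoop fs -put "/home/admin/logs/$f" "$dst/"; done)
printf '%s\n' "$cmds"
```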

View files

hadoop fs -cat /home/admin/newfile

Create a new empty file

hadoop fs -touchz /user/new.txt

Rename a file on HDFS

hadoop fs -mv /user/test.txt /user/ok.txt

Merge all files under a given HDFS directory into a single file and download it to the local filesystem

hadoop dfs -getmerge /user /home/t

Submit a MapReduce job

bin/hadoop jar /home/admin/hadoop/job.jar [jobMainClass] [jobArgs]

Kill a running job

hadoop job -kill job_201005310937_0053
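The job id passed to -kill usually comes from `hadoop job -list`. A sketch that pulls the id out of a hypothetical captured listing (the id and columns below are made up for illustration):

```shell
# Hypothetical `hadoop job -list` output, captured for illustration:
list='1 jobs currently running
JobId                  State  StartTime      UserName
job_201005310937_0053  1      1275269000000  admin'

# Extract the job id(s); each would then be passed to: hadoop job -kill <job-id>
jobid=$(printf '%s\n' "$list" | awk '/^job_/ {print $1}')
echo "$jobid"
```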

More Hadoop commands

Running hadoop with no arguments prints the usage message, which lists all subcommands:

namenode -format     format the DFS filesystem

secondarynamenode    run the DFS secondary namenode

namenode             run the DFS namenode

datanode             run a DFS datanode

dfsadmin             run a DFS admin client

fsck                 run a DFS filesystem checking utility

fs                   run a generic filesystem user client

balancer             run a cluster balancing utility

jobtracker           run the MapReduce job tracker node

pipes                run a Pipes job

tasktracker          run a MapReduce task tracker node

job                  manipulate MapReduce jobs

queue                get information regarding JobQueues

version              print the version

jar <jar>            run a jar file

distcp <srcurl> <desturl>   copy files or directories recursively

archive -archiveName NAME <src>* <dest>   create a Hadoop archive

daemonlog            get/set the log level for each daemon

or

CLASSNAME            run the class named CLASSNAME

Most commands print help when invoked w/o parameters.

Description:

1. List all the commands supported by the Hadoop shell:

$ bin/hadoop fs -help

2. Display detailed help for a given command:

$ bin/hadoop fs -help command-name

3. View the history log summary under the specified path, showing job details plus details of failed and killed tasks:

$ bin/hadoop job -history output-dir

4. More details about a job, such as successful tasks and the attempts made for each task, can be viewed with:

$ bin/hadoop job -history all output-dir

5. Format a new distributed filesystem:

$ bin/hadoop namenode -format

6. On the designated NameNode, run the following command to start HDFS; it also starts the DataNode daemon on all listed slaves:

$ bin/start-dfs.sh

7. On the designated JobTracker, run the following command to start MapReduce:

$ bin/start-mapred.sh

8. On the designated NameNode, run the following command to stop HDFS:

$ bin/stop-dfs.sh

9. On the designated JobTracker, run the following command to stop MapReduce:

$ bin/stop-mapred.sh

DFSShell

10. Create a directory named /foodir:

$ bin/hadoop dfs -mkdir /foodir

11. Remove the directory named /foodir:

$ bin/hadoop dfs -rmr /foodir

12. View the contents of the file named /foodir/myfile.txt:

$ bin/hadoop dfs -cat /foodir/myfile.txt

DFSAdmin

13. Put the cluster in safe mode:

$ bin/hadoop dfsadmin -safemode enter

14. Display the DataNode list:

$ bin/hadoop dfsadmin -report

15. Decommission the DataNode named datanodename:

$ bin/hadoop dfsadmin -decommission datanodename
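The -report output is plain text and easy to scrape. A sketch against a hypothetical captured fragment (the address and count below are made up for illustration):

```shell
# Hypothetical fragment of `hadoop dfsadmin -report` output:
report='Datanodes available: 3

Name: 10.0.0.1:50010
Decommission Status : Normal'

# Pull out the live-DataNode count from the summary line.
live=$(printf '%s\n' "$report" | awk -F': ' '/^Datanodes available/ {print $2}')
echo "$live"
```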

16. bin/hadoop dfsadmin -help lists all currently supported commands. For example:

* -report: reports basic statistics of HDFS. Some of this information is also available on the NameNode web front page.

* -safemode: although usually not necessary, an administrator can manually make the NameNode enter or leave safe mode.

* -finalizeUpgrade: removes the cluster backup made during the last upgrade.

17. Explicitly put HDFS into safe mode:

$ bin/hadoop dfsadmin -safemode enter

18. Before a new upgrade, administrators need to remove the existing backup with the upgrade-finalization command:

$ bin/hadoop dfsadmin -finalizeUpgrade

19. Find out whether an upgrade finalization needs to be performed on a cluster:

$ dfsadmin -upgradeProgress status

20. Run the new version with the -upgrade option:

$ bin/start-dfs.sh -upgrade

21. To return to the old version, you must stop the cluster, deploy the old version of Hadoop, and start the cluster with the rollback option:

$ bin/start-dfs.sh -rollback

22. The following new commands or options support quotas. The first two are administrator commands.

* dfsadmin -setQuota <quota> <dirname>...<dirname>

Set the quota of each directory to N. The command is attempted on each directory; an error is reported if N is not a positive long integer, if a directory does not exist or is a file, or if a directory would immediately exceed the new quota.

* dfsadmin -clrQuota <dirname>...<dirname>

Remove the quota of each directory. The command is attempted on each directory; an error is reported if a directory does not exist or is a file. It is not an error if the directory had no quota set.

* fs -count -q <dirname>...<dirname>

With the -q option, the quota set on each directory and the remaining quota are also reported. If a directory has no quota set, none and inf are reported.
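The quota columns come first in the -count -q output, so they can be picked off positionally. A sketch against a hypothetical one-line sample (the counts and path are made up for illustration):

```shell
# Hypothetical `hadoop fs -count -q /user/admin/aaron` output (one line per directory):
# QUOTA  REMAINING_QUOTA  DIR_COUNT  FILE_COUNT  CONTENT_SIZE  PATHNAME
line='none inf 4 2 217654 /user/admin/aaron'

# Split the line into positional parameters and pick the quota and path columns.
set -- $line
quota=$1; path=$6
echo "$path quota=$quota"
```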

23. Create a Hadoop archive:

$ hadoop archive -archiveName NAME <src>* <dest>

-archiveName NAME: the name of the archive to create.

src: filesystem path names, which take regular expressions as usual.

dest: the target directory in which the archive is saved.

24. Recursively copy files or directories:

$ hadoop distcp <srcurl> <desturl>

srcurl: the source URL.

desturl: the destination URL.

25. Run the HDFS filesystem checking utility (fsck):

Usage: hadoop fsck [GENERIC_OPTIONS] <path> [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]

Command option description:

<path>          the path to start checking from.

-move           move corrupted files to /lost+found.

-delete         delete corrupted files.

-openforwrite   print files opened for write.

-files          print the files being checked.

-blocks         print the block report.

-locations      print the location of every block.

-racks          print the network topology of the DataNodes.
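A common use of fsck in scripts is to check the overall health verdict on its last line. A sketch against a hypothetical captured tail of the output (the block count is made up for illustration):

```shell
# Hypothetical tail of `hadoop fsck /` output:
fsck_out='Total blocks: 104
The filesystem under path / is HEALTHY'

# fsck ends with "... is HEALTHY" or "... is CORRUPT".
if printf '%s\n' "$fsck_out" | grep -q 'is HEALTHY$'; then
  status=HEALTHY
else
  status=CORRUPT
fi
echo "$status"
```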

26. Command for interacting with MapReduce jobs (job):

Usage: hadoop job [GENERIC_OPTIONS] [-submit <job-file>] | [-status <job-id>] | [-counter <job-id> <group-name> <counter-name>] | [-kill <job-id>] | [-events <job-id> <from-event-#> <#-of-events>] | [-history [all] <jobOutputDir>] | [-list [all]] | [-kill-task <task-id>] | [-fail-task <task-id>]

Command option description:

-submit <job-file>   submit the job.

-status <job-id>   print the map and reduce completion percentages and all job counters.

-counter <job-id> <group-name> <counter-name>   print the counter value.

-kill <job-id>   kill the specified job.

-events <job-id> <from-event-#> <#-of-events>   print the details of the events received by the JobTracker in the given range.

-history [all] <jobOutputDir>   print job details plus details of failed and killed tasks. More details about a job, such as successful tasks and the attempts made for each task, can be viewed by specifying the [all] option.

-list [all]   -list all displays all jobs; -list displays only jobs that have yet to complete.

-kill-task <task-id>   kill the task. Killed tasks are not counted against failed attempts.

-fail-task <task-id>   fail the task. Failed tasks are counted against failed attempts.

27. Run a Pipes job:

Usage: hadoop pipes [-conf <path>] [-jobconf <key=value>, <key=value>, ...] [-input <path>] [-output <path>] [-jar <jar file>] [-inputformat <class>] [-map <class>] [-partitioner <class>] [-reduce <class>] [-writer <class>] [-program <executable>] [-reduces <num>]

Command option description:

-conf <path>   job configuration.

-jobconf <key=value>, <key=value>, ...   add/override job configuration items.

-input <path>   input directory.

-output <path>   output directory.

-jar <jar file>   jar filename.

-inputformat <class>   InputFormat class.

-map <class>   Java Map class.

-partitioner <class>   Java Partitioner class.

-reduce <class>   Java Reduce class.

-writer <class>   Java RecordWriter class.

-program <executable>   URI of the executable program.

-reduces <num>   number of reduces.

28. Print version information:

Usage: hadoop version

29. The Hadoop script can be used to invoke any class:

Usage: hadoop CLASSNAME

Runs the class named CLASSNAME.

30. Run the cluster balancing tool (balancer). The administrator can simply press Ctrl-C to stop the balancing process:

Usage: hadoop balancer [-threshold <threshold>]

Command option description:

-threshold <threshold>   percentage of disk capacity. This overrides the default threshold.
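The threshold bounds how far any node's utilization may drift from the cluster average before the balancer moves blocks to it or from it. A sketch of that comparison with hypothetical figures (the percentages below are made up for illustration):

```shell
# Hypothetical figures: one node's disk usage vs. the cluster average, in percent,
# checked against: hadoop balancer -threshold 10
node_util=78
cluster_avg=60
threshold=10

# A node is a balancing candidate when |node_util - cluster_avg| > threshold.
diff=$((node_util - cluster_avg))
abs=${diff#-}
if [ "$abs" -gt "$threshold" ]; then state=over-threshold; else state=balanced; fi
echo "$state"
```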

31. Get or set the log level of each daemon (daemonlog):

Usage: hadoop daemonlog -getlevel <host:port> <name>

Usage: hadoop daemonlog -setlevel <host:port> <name> <level>

Command option description:

-getlevel <host:port> <name>   print the log level of the daemon running at <host:port>. Internally the command connects to http://<host:port>/logLevel?log=<name>

-setlevel <host:port> <name> <level>   set the log level of the daemon running at <host:port>. Internally the command connects to http://<host:port>/logLevel?log=<name>
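The URL daemonlog fetches can be assembled by hand, which is useful for hitting the servlet directly with a browser or curl. A sketch with a hypothetical host and logger name:

```shell
# Hypothetical NameNode web address and logger name:
host_port=namenode.example.com:50070
logname=org.apache.hadoop.hdfs.server.namenode.NameNode

# This is the /logLevel servlet URL that daemonlog -getlevel would fetch.
url="http://$host_port/logLevel?log=$logname"
echo "$url"
```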

32. Run an HDFS DataNode:

Usage: hadoop datanode [-rollback]

Command option description:

-rollback   roll back the DataNode to the previous version. This should be used after stopping the DataNode and distributing the old version of Hadoop.

33. Run an HDFS dfsadmin client:

Usage: hadoop dfsadmin [GENERIC_OPTIONS] [-report] [-safemode enter | leave | get | wait] [-refreshNodes] [-finalizeUpgrade] [-upgradeProgress status | details | force] [-metasave filename] [-setQuota <quota> <dirname>...<dirname>] [-clrQuota <dirname>...<dirname>] [-help [cmd]]

Command option description:

-report   report basic filesystem information and statistics.

-safemode enter | leave | get | wait   safe-mode maintenance command. Safe mode is a NameNode state in which the NameNode:

1. does not accept changes to the namespace (read-only), and

2. does not replicate or delete blocks.

The NameNode enters safe mode automatically at startup and leaves it automatically once the configured minimum percentage of blocks satisfies the minimum replication condition. Safe mode can also be entered manually, but then it must be turned off manually as well.

-refreshNodes   re-read the hosts and exclude files to update the set of DataNodes that are allowed to connect to the NameNode and those that should be decommissioned or recommissioned.

-finalizeUpgrade   finalize an HDFS upgrade. The DataNodes delete their previous-version working directories, and the NameNode does the same. This completes the upgrade process.

-upgradeProgress status | details | force   request the current upgrade status or its details, or force the upgrade to proceed.

-metasave filename   save the NameNode's primary data structures to <filename> in the directory specified by the hadoop.log.dir property. <filename> contains one line for each of:

1. DataNodes the NameNode has received heartbeats from,

2. blocks waiting to be replicated,

3. blocks currently being replicated, and

4. blocks waiting to be deleted.

-setQuota <quota> <dirname>...<dirname>   set the quota of each directory. A directory quota is a long integer that places a hard limit on the number of names under the directory tree.

The command works well on a directory, and an error is reported if:

1. N is not a positive integer, or

2. the user is not an administrator, or

3. the directory does not exist or is a file, or

4. the directory would immediately exceed the newly set quota.

-clrQuota <dirname>...<dirname>   clear the quota of each directory.

The command works well on a directory, and an error is reported if:

1. the directory does not exist or is a file, or

2. the user is not an administrator.

It is not an error if the directory has no quota.

-help [cmd]   display help for the given command, or for all commands if none is given.

34. Run the MapReduce job tracker node (jobtracker):

Usage: hadoop jobtracker

35. Run the NameNode. For more information about upgrade, rollback, and finalization, refer to Upgrade and Rollback:

Usage: hadoop namenode [-format] | [-upgrade] | [-rollback] | [-finalize] | [-importCheckpoint]

Command option description:

-format   format the NameNode. It starts the NameNode, formats it, and then shuts it down.

-upgrade   after a new version of Hadoop has been distributed, the NameNode should be started with the upgrade option.

-rollback   roll back the NameNode to the previous version. This option is used after stopping the cluster and distributing the old version of Hadoop.

-finalize   delete the previous state of the filesystem. The most recent upgrade becomes permanent, the rollback option becomes unavailable, and after finalizing it shuts the NameNode down.

-importCheckpoint   load the image from a checkpoint directory and save it to the current one. The checkpoint directory is read from the fs.checkpoint.dir property.

36. Run the HDFS secondary NameNode:

Usage: hadoop secondarynamenode [-checkpoint [force]] | [-geteditsize]

Command option description:

-checkpoint [force]   start a checkpoint on the secondary NameNode if the EditLog size >= fs.checkpoint.size. If force is used, checkpoint regardless of the EditLog size.

-geteditsize   print the EditLog size.
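The checkpoint trigger in item 36 is just a size comparison, which can be sketched in shell with hypothetical figures (the EditLog size below is made up; 67108864 bytes is the 64 MB default of fs.checkpoint.size):

```shell
# Hypothetical EditLog size (as reported by -geteditsize) vs. the checkpoint threshold.
editlog_size=70000000
checkpoint_size=67108864   # default fs.checkpoint.size: 64 MB

# -checkpoint fires when the EditLog has grown to at least fs.checkpoint.size.
if [ "$editlog_size" -ge "$checkpoint_size" ]; then
  decision=checkpoint
else
  decision=skip
fi
echo "$decision"
```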
