Hadoop Startup Process Analysis

A detailed walkthrough of how start-dfs.sh starts the HDFS processes.


The scripts involved are:


Under bin/:
hadoop-config.sh
start-dfs.sh
hadoop-daemons.sh
slaves.sh
hadoop-daemon.sh
hadoop


Under conf/:
hadoop-env.sh


Both hadoop-config.sh and hadoop-env.sh are scripts that set up the Hadoop-related environment variables; each of the scripts below pulls hadoop-config.sh in with the same preamble, sketched below.
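A rough sketch of that common preamble, paraphrased from the 0.20.x scripts rather than copied verbatim:

    # resolve the absolute path of the bin/ directory holding the current script,
    # then source hadoop-config.sh, which sets HADOOP_HOME and parses generic
    # options such as --config (which becomes HADOOP_CONF_DIR)
    bin=`dirname "$0"`
    bin=`cd "$bin"; pwd`
    . "$bin"/hadoop-config.sh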


start-dfs.sh calls hadoop-daemon.sh to start the NameNode, calls hadoop-daemons.sh to start the SecondaryNameNode, and calls hadoop-daemons.sh to start the DataNodes (the relevant calls are sketched below). Because the DataNode processes must be started on more than one node, only that path is analyzed here; starting the other two processes is comparatively simple and can be understood by reference to it.
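For reference, the corresponding calls inside bin/start-dfs.sh in hadoop-0.20.1 look approximately like this:

    # NameNode on the local node, DataNodes on the hosts in conf/slaves,
    # SecondaryNameNode on the hosts in conf/masters
    "$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode $nameStartOpt
    "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode $dataStartOpt
    "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start secondarynamenode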


start-dfs.sh: loads hadoop-config.sh, i.e., it executes source hadoop-config.sh
|
|
|
hadoop-daemons.sh: loads hadoop-config.sh; this script runs on the master node.
| Calls slaves.sh and passes it parameters: exec "$bin/slaves.sh" --config $HADOOP_CONF_DIR cd "$HADOOP_HOME" \; "$bin/hadoop-daemon.sh" --config $HADOOP_CONF_DIR "$@"
| The form of the exec command is: exec <script> <parameters> (a short illustration follows below)
| For example, it executes the following command to start the DataNode:
| exec ***/test_slaves.sh --config ***/../conf cd ***/.. ';' ***/hadoop-daemon.sh --config ***/../conf start datanode
| where *** = /root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin
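| A quick illustration of exec with a hypothetical command (not from the Hadoop source): exec replaces the current shell process with the given command instead of forking a child, so here hadoop-daemons.sh itself is replaced by slaves.sh:
|   exec echo "the shell that called exec has been replaced by this echo"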
|
slaves.sh: loads hadoop-config.sh; this script runs on the master node. It starts hadoop-daemon.sh on every slave node and lets those invocations run in parallel.
| It traverses each host in the slaves file and executes an ssh command for each one in turn; the ssh command takes the script and parameters passed in from hadoop-daemons.sh and starts the related process on each remote node.
|
| for slave in `cat "$HOSTLIST" | sed "s/#.*$//;/^$/d"`
| do
|   ssh $HADOOP_SSH_OPTS $slave $"${@// /\\ }" 2>&1 | sed "s/^/$slave: /" &
|   For example, the command takes the following form:
|   ssh hdfs05 cd ***/.. ';' ***/hadoop-daemon.sh --config ***/../conf start datanode 2>&1 | sed 's/^/hdfs05: /' &
|   The command executed on node hdfs05 is: cd ***/.. ';' ***/hadoop-daemon.sh --config ***/../conf start datanode (the sed part runs back on the master and prefixes each line of remote output with the host name).
|   Because the ssh command is put in the background with &, the master node can launch the script on all nodes, and the per-node scripts execute in parallel.
|   The script invoked on each remote node is hadoop-daemon.sh.
| done
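| A minimal demo (not from the Hadoop source) of the quoting trick "${@// /\\ }": it backslash-escapes every space inside each argument, so word boundaries survive the second round of word splitting performed by the remote shell:
|   set -- cd "/opt/hadoop home" \; ./hadoop-daemon.sh start datanode    # hypothetical arguments
|   echo ssh somehost "${@// /\\ }"
| which prints: ssh somehost cd /opt/hadoop\ home ; ./hadoop-daemon.sh start datanode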
|
hadoop-daemon.sh: loads hadoop-config.sh and hadoop-env.sh, setting the Hadoop-related variables and the current shell's Java environment variables.
| This script runs on the slave node; in this case it is launched by the master node but executes on the slave itself. To test the script, it must be tested independently on a node.
| Both starting and stopping a process are handled here.
| Stopping simply kills the related process, such as the DataNode, and is relatively simple.
| Here, according to the parameters passed in from slaves.sh, the action is start datanode.
| To start the process, it executes the following command, which runs in the background on the current node:
| nohup nice -n $HADOOP_NICENESS "$HADOOP_HOME"/bin/hadoop --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null &
| For example, the command executed here may take the following form; it starts the hadoop script on this node, runs it in the background, and sends the standard output to $log:
| nohup nice -n 0 ***/../bin/hadoop --config ***/../conf datanode > "$log" 2>&1 < /dev/null &
| When testing, users can run nohup nice -n 0 ***/../bin/hadoop --config ***/../conf datanode directly,
| or simply run hadoop datanode; nohup and & only put the command in the background, keep it running after hangup, and leave the current shell free (see the nohup command and the illustration below).
| where *** = /root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin,
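| A short illustration of the same backgrounding pattern with a hypothetical command:
|   nohup nice -n 0 sleep 60 > /tmp/test.log 2>&1 < /dev/null &
| nohup keeps the process running after the terminal hangs up, nice -n 0 sets its scheduling priority, > "$log" 2>&1 redirects both stdout and stderr to the log file, < /dev/null detaches stdin, and the trailing & frees the current shell immediately.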
|
hadoop: loads hadoop-config.sh and hadoop-env.sh, setting the Hadoop-related variables and the current shell's Java environment variables; using the datanode parameter passed in earlier, it finds the related Java class and starts the DataNode process.
| Because of the command used in hadoop-daemon.sh, this script runs in the background on the node, and all standard output and standard error is written to $log.
| Therefore, when users need to test this script on its own, it must be tested separately, for example by running hadoop datanode to test starting the DataNode.
| Through the datanode parameter it finds the related Java class, CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode', sets the Java CLASSPATH environment variable and the maximum memory the JVM may occupy, and executes the java command to run the DataNode class.
| The script finally starts the process, here the DataNode, by executing the following command:
exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS -classpath "$CLASSPATH" $CLASS "$@"
The command may take the following form:


exec


/home/hadoop/jdk1.6.0_07/bin/java    the java command ($JAVA)


-Xmx1000m    the maximum memory space the JVM may occupy ($JAVA_HEAP_MAX)


-Dcom.sun.management.jmxremote -Dhadoop.log.dir=/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/.. -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Djava.library.path=/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/native/Linux-amd64-64 -Dhadoop.policy.file=hadoop-policy.xml


These are the Hadoop options, $HADOOP_OPTS


-classpath /root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../conf:/home/hadoop/jdk1.6.0_07/lib/tools.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/..:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../hadoop-0.20.1-core.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/commons-cli-1.2.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/commons-codec-1.3.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/commons-el-1.0.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/commons-httpclient-3.0.1.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/commons-logging-1.0.4.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/commons-logging-api-1.0.4.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/commons-net-1.4.1.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/core-3.1.1.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/hsqldb-1.8.0.10.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/jasper-compiler-5.5.12.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/jasper-runtime-5.5.12.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/jets3t-0.6.1.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/jetty-6.1.14.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/jetty-util-6.1.14.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/junit-3.8.1.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/kfs-0.2.2.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/log4j-1.2.15.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/oro-2.0.8.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/servlet-api-2.5-6.1.14.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/slf4j-api-1.4.3.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/slf4j-log4j12-1.4.3.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/xmlenc-0.52.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/jsp-2.1/jsp-2.1.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/bin/../lib/jsp-2.1/jsp-api-2.1.jar


These are the jar packages on the Java CLASSPATH environment variable, the runtime dependencies of the DataNode class ($CLASSPATH)


Org.apache.hadoop.hdfs.server.datanode.DataNode


This is the Java class that finally starts the DataNode process ($CLASS)
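Putting the last step together, the relevant fragment of the bin/hadoop script in hadoop-0.20.1 looks approximately like this (abridged):

    # map the "datanode" command to its Java class, then replace the shell with the JVM
    elif [ "$COMMAND" = "datanode" ] ; then
      CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'
      HADOOP_OPTS="$HADOOP_OPTS $HADOOP_DATANODE_OPTS"
    # ... other commands elided ...
    exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS -classpath "$CLASSPATH" $CLASS "$@"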