Spark personal practice series (2) -- spark service script analysis


Preface:

Spark has become very popular recently. This article does not discuss Spark internals; instead it studies the scripts used to build and run a Spark cluster, in the hope of understanding the cluster from the perspective of its service scripts. The Spark version is 1.0.1, and the cluster is built in standalone mode. The basic architecture is master-slave (worker): a single master node plus multiple slave (worker) nodes.
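
Before diving into the scripts, a minimal standalone configuration helps anchor the variable names that appear later. The host names and values below are purely illustrative placeholders, not taken from any real cluster:

# conf/slaves -- one worker host per line (host names are placeholders)
worker01
worker02

# conf/spark-env.sh -- illustrative values; adjust to your own cluster
export SPARK_MASTER_IP=master01
export SPARK_MASTER_PORT=7077
export SPARK_MASTER_WEBUI_PORT=8080
export SPARK_WORKER_INSTANCES=1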

Script directory

start-all.sh: starts the entire cluster
stop-all.sh: shuts down the entire cluster
start-master.sh: starts the master node
stop-master.sh: stops the master node
start-slaves.sh: starts the slave (worker) nodes across the entire cluster
start-slave.sh: starts a single slave (worker) instance on the local node
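
As a quick usage sketch (assuming the standard directory layout and passwordless ssh from the master to every slave), the whole cluster is started and stopped from the master node:

# Run from the Spark installation directory on the master node
sbin/start-all.sh   # start the master and all workers listed in conf/slaves
sbin/stop-all.sh    # shut the whole cluster down again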

The overall dependency graph of these scripts is as follows:

*) Analysis of start-all.sh

# Load the Spark configuration
. "$sbin/spark-config.sh"

# Start Master
"$sbin"/start-master.sh $TACHYON_STR

# Start Workers
"$sbin"/start-slaves.sh $TACHYON_STR

Comments:
#1. Load and execute spark-config.sh
#2. Start the master node
#3. Start each slave (worker) node

*) First study the sbin/spark-config.sh script

export SPARK_PREFIX=`dirname "$this"`/..
export SPARK_HOME=${SPARK_PREFIX}
export SPARK_CONF_DIR="$SPARK_HOME/conf"

Comments:
# The role of spark-config.sh is to export the common environment variables SPARK_HOME and SPARK_CONF_DIR
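
As a rough illustration (assuming, purely for this example, that Spark is installed under /opt/spark-1.0.1), the three exports would resolve to:

# Hypothetical installation path, for illustration only
# SPARK_PREFIX   = /opt/spark-1.0.1
# SPARK_HOME     = /opt/spark-1.0.1
# SPARK_CONF_DIR = /opt/spark-1.0.1/conf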

*) Analysis of start-master.sh

. "$sbin/spark-config.sh" . "$SPARK_PREFIX/bin/load-spark-env.sh""$sbin"/spark-daemon.sh start org.apache.spark.deploy.master.Master 1 --ip $SPARK_MASTER_IP --port $SPARK_MASTER_PORT --webui-port $SPARK_MASTER_WEBUI_PORT

Comments:
# Source spark-config.sh first, then load-spark-env.sh
# Start the master service through the spark-daemon.sh script, passing in the relevant parameters: the IP and port the master binds to, plus the web UI port
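
For instance, with SPARK_MASTER_IP=master01, SPARK_MASTER_PORT=7077 and SPARK_MASTER_WEBUI_PORT=8080 (placeholder values, not taken from the original cluster), the spark-daemon.sh line above would expand to roughly:

"$sbin"/spark-daemon.sh start org.apache.spark.deploy.master.Master 1 \
  --ip master01 --port 7077 --webui-port 8080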

*) Interpret the load-spark-env.sh script

if [ -z "$SPARK_ENV_LOADED" ]; then  export SPARK_ENV_LOADED=1  # Returns the parent of the directory this script lives in.  parent_dir="$(cd `dirname $0`/..; pwd)"  use_conf_dir=${SPARK_CONF_DIR:-"$parent_dir/conf"}  if [ -f "${use_conf_dir}/spark-env.sh" ]; then    # Promote all variable declarations to environment (exported) variables    set -a    . "${use_conf_dir}/spark-env.sh"    set +a  fifi

Comments:
# The key step is to source conf/spark-env.sh so that all user-defined variables take effect and override the default values
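
The set -a / set +a idiom is what turns the plain assignments in spark-env.sh into exported environment variables. A minimal sketch of the same idiom, using a made-up file and variable name:

# Create a throwaway env file containing a plain (non-exported) assignment
cat > /tmp/demo-env.sh <<'EOF'
MY_SETTING=hello
EOF

set -a                 # auto-export every variable assigned from here on
. /tmp/demo-env.sh
set +a                 # turn auto-export off again

bash -c 'echo "child process sees: $MY_SETTING"'   # prints: child process sees: hello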

*) Analysis of start-slaves.sh

# Launch the slaves
if [ "$SPARK_WORKER_INSTANCES" = "" ]; then
  exec "$sbin/slaves.sh" cd "$SPARK_HOME" \; "$sbin/start-slave.sh" 1 spark://$SPARK_MASTER_IP:$SPARK_MASTER_PORT
else
  if [ "$SPARK_WORKER_WEBUI_PORT" = "" ]; then
    SPARK_WORKER_WEBUI_PORT=8081
  fi
  for ((i=0; i<$SPARK_WORKER_INSTANCES; i++)); do
    "$sbin/slaves.sh" cd "$SPARK_HOME" \; "$sbin/start-slave.sh" $(( $i + 1 )) \
      spark://$SPARK_MASTER_IP:$SPARK_MASTER_PORT \
      --webui-port $(( $SPARK_WORKER_WEBUI_PORT + $i ))
  done
fi

Comments:
# $SPARK_WORKER_INSTANCES specifies the number of worker processes to run on each machine
# Concretely: for each worker instance, sbin/slaves.sh is executed with "sbin/start-slave.sh ..." as its argument, so that command runs on every slave node; each slave (worker) instance is given its own web UI port
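
For example, if conf/spark-env.sh contained the hypothetical settings SPARK_WORKER_INSTANCES=2 and SPARK_WORKER_WEBUI_PORT=8081, the loop above would issue two slaves.sh calls, each of which runs the corresponding start-slave.sh command on every slave host:

"$sbin/slaves.sh" cd "$SPARK_HOME" \; "$sbin/start-slave.sh" 1 spark://$SPARK_MASTER_IP:$SPARK_MASTER_PORT --webui-port 8081
"$sbin/slaves.sh" cd "$SPARK_HOME" \; "$sbin/start-slave.sh" 2 spark://$SPARK_MASTER_IP:$SPARK_MASTER_PORT --webui-port 8082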

*) Analysis of sbin/slaves.sh

. "$SPARK_PREFIX/bin/load-spark-env.sh"if [ "$HOSTLIST" = "" ]; then  if [ "$SPARK_SLAVES" = "" ]; then    export HOSTLIST="${SPARK_CONF_DIR}/slaves"  else    export HOSTLIST="${SPARK_SLAVES}"  fifi# By default disable strict host key checkingif [ "$SPARK_SSH_OPTS" = "" ]; then  SPARK_SSH_OPTS="-o StrictHostKeyChecking=no"fifor slave in `cat "$HOSTLIST"|sed "s/#.*$//;/^$/d"`; do  ssh $SPARK_SSH_OPTS $slave $"${@// /\\ }"     2>&1 | sed "s/^/$slave: /" &  if [ "$SPARK_SLAVE_SLEEP" != "" ]; then    sleep $SPARK_SLAVE_SLEEP  fidone

Comments:
# The sbin/slaves.sh script reads the conf/slaves file (which lists the slave nodes); see the previous article for details
# For each slave node it executes, in parallel over ssh:
#   sbin/start-slave.sh $(( $i + 1 )) spark://$SPARK_MASTER_IP:$SPARK_MASTER_PORT \
#     --webui-port $(( $SPARK_WORKER_WEBUI_PORT + $i ))
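
Putting the pieces together: for a slave host named worker01 and a master at master01:7077, with Spark installed under /opt/spark-1.0.1 (all placeholder values), one iteration of the ssh loop effectively becomes:

ssh -o StrictHostKeyChecking=no worker01 \
  cd /opt/spark-1.0.1 \; /opt/spark-1.0.1/sbin/start-slave.sh 1 spark://master01:7077 \
  2>&1 | sed "s/^/worker01: /" &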

*) Analysis of sbin/start-slave.sh

"$sbin"/spark-daemon.sh start org.apache.spark.deploy.worker.Worker "[email protected]"

Comments:
# Start org.apache.spark.deploy.worker.Worker via the spark-daemon.sh script
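
The same script can also be run by hand on a slave node; with a placeholder master host and port it would look like:

# Start worker instance 1 on this node, pointing it at the (hypothetical) master
sbin/start-slave.sh 1 spark://master01:7077 --webui-port 8081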

*) Analysis of sbin/spark-daemon.sh
Finally, spark-daemon.sh launches the requested class through bin/spark-class, setting the JVM parameters along the way.
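
Conceptually, once spark-daemon.sh has taken care of pid files and log redirection, starting the master boils down to a bin/spark-class invocation along these lines (host and ports are placeholders):

bin/spark-class org.apache.spark.deploy.master.Master \
  --ip master01 --port 7077 --webui-port 8080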
