Spark on k8s trial steps

Background:

Spark 2.3.0 introduced support for using Kubernetes (k8s) as a native resource manager to schedule Spark. Scheduling Spark natively on k8s has the following advantages:

- Native scheduling: no two-level scheduling is needed; the native k8s scheduler is used directly, so Spark can run mixed with other applications.
- Resource isolation: tasks can be submitted to a specified namespace, so the native k8s quota mechanism can be reused to limit task resources.
- Resource allocation: resource limits can be specified for each Spark task, giving tasks better isolation.
- Customization: users can package their own application into the Spark base image, which is more flexible and convenient.

Trial prerequisites: a k8s cluster of version 1.7 or later, because a Spark on k8s task actually runs in the cluster in the form of custom resources with a custom controller; the cluster also needs k8s DNS and RBAC enabled. Download Spark 2.3.0: https://www.apache.org/dyn/closer.lua/spark/spark-2.3.0/spark-2.3.0-bin-hadoop2.7.tgz

Trial steps: building the image:
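Before starting, the prerequisites can be checked with kubectl. This is a sketch that assumes kubectl is already configured against the target cluster:

```shell
# Confirm the cluster is version 1.7 or later
kubectl version --short

# RBAC is enabled if the rbac API group is served
kubectl api-versions | grep rbac.authorization.k8s.io

# kube-dns should be running in the kube-system namespace
kubectl -n kube-system get pods -l k8s-app=kube-dns
```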

The commands below build the base image, which contains Spark and the official examples; the trial in this article uses one of the official examples.

cd /path/to/spark-2.3.0-bin-hadoop2.7
docker build -t <your.image.hub/yourns>/spark:2.3.0 -f kubernetes/dockerfiles/spark/Dockerfile .
docker push <your.image.hub/yourns>/spark:2.3.0
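One way to package your own application on top of the base image is a small derived Dockerfile. This is only a sketch: the application jar name `my-app.jar` and the tag are placeholders, not part of the original article:

```dockerfile
# Hypothetical Dockerfile building on the base image above;
# my-app.jar and the image tag are placeholders for illustration.
FROM <your.image.hub/yourns>/spark:2.3.0
COPY target/my-app.jar /opt/spark/jars/my-app.jar
```

Build and push it the same way as the base image, then reference the jar with a `local:///opt/spark/jars/...` path at submission time.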

Users can package their own application together with the base image, then set the main class and application path at startup to submit their own application as a task. Task submission:

bin/spark-submit \
    --master k8s://<k8s apiserver address> \
    --deploy-mode cluster \
    --name spark-pi \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.executor.instances=5 \
    --conf spark.kubernetes.container.image=<your.image.hub/yourns>/spark:2.3.0 \
    local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar
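After submission, the driver and executor pods can be watched from another terminal. A sketch; the exact driver pod name is generated by Spark and will differ in your cluster:

```shell
# Watch the driver pod and then the executor pods appear
kubectl get pods -w

# Follow the driver log; the pod name starts with the --name value
# (<suffix> is a placeholder for the generated part of the name)
kubectl logs -f spark-pi-<suffix>-driver
```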

For more default parameter configurations, see reference 1 (Spark running on k8s).

Note the following pitfalls:

- The Spark examples are compiled with JDK 1.8. If startup fails with "Unsupported major.minor version 52.0", switch to a 1.8 JDK.
- spark-submit loads the cluster configuration from ~/.kube/config by default, so put your k8s cluster config in that location.
- If the Spark driver fails with "Error: Could not find or load main class org.apache.spark.examples.SparkPi", the path after `local://` in the launch parameters must be the path of your Spark application inside the container.
- If the driver throws "Caused by: java.net.UnknownHostException: kubernetes.default.svc: Try again", make sure the k8s cluster nodes can reach each other over the network.
- If the driver throws "system:serviceaccount:default:default cannot get pods in the namespace default", it is a permission problem; run the following two commands:

kubectl create rolebinding default-view --clusterrole=view --serviceaccount=default:default --namespace=default
kubectl create rolebinding default-admin --clusterrole=admin --serviceaccount=default:default --namespace=default

Then the task can execute:
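Instead of granting roles to the default service account, a dedicated service account can be created and passed to spark-submit. A sketch: the account name `spark` is an assumption, and `spark.kubernetes.authenticate.driver.serviceAccountName` is the Spark 2.3 property for selecting the driver's service account:

```shell
# Create a dedicated service account and grant it edit rights in the namespace
kubectl create serviceaccount spark
kubectl create clusterrolebinding spark-role --clusterrole=edit \
  --serviceaccount=default:spark --namespace=default

# Then add this flag to the spark-submit command:
#   --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark
```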

Once the Spark demo is running, you can see that spark-submit acts like a controller managing a single Spark task: it first creates the service and the driver for the task, and after the driver is running it starts the executors, whose count is given by the --conf spark.executor.instances=5 parameter. After the job completes, the executors are deleted automatically, and the driver is cleaned up by the default GC mechanism. Reference:

Spark running on k8s
Issue #34377
