Running test cases on Spark

Today some friends asked how to run unit tests on Spark, so here is how to do it with SBT.

To run Spark's test cases, you can use the SBT test commands:

1. Run all test cases:

sbt/sbt test


2. Run a single test suite:

sbt/sbt "test-only *DriverSuite*"
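
For reference, test-only also accepts a fully qualified suite name, which avoids matching more suites than intended; assuming the standard Spark package layout, that would be:

sbt/sbt "test-only org.apache.spark.DriverSuite"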


The following is an example:

This test case is located at $SPARK_HOME/core/src/test/scala/org/apache/spark/DriverSuite.scala.

FunSuite is the test suite base class in ScalaTest, and a test class must extend it. The example here is mainly a regression test: after a Spark program finishes normally, the driver should exit normally.
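
For reference, a ScalaTest suite can be as small as the following sketch (the class name and test here are made up purely for illustration):

import org.scalatest.FunSuite

// A hypothetical minimal suite: extend FunSuite and declare test cases with test("...").
class SimpleSuite extends FunSuite {
  test("addition works") {
    assert(1 + 1 === 2)
  }
}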

Note: I will use this example to simulate both a passing and a failing test. The failure scenario has nothing to do with the actual purpose of DriverSuite; it only serves as a demonstration. :)

First, the normal case, where the driver exits cleanly:

package org.apache.spark

import java.io.File

import org.apache.log4j.Logger
import org.apache.log4j.Level
import org.scalatest.FunSuite
import org.scalatest.concurrent.Timeouts
import org.scalatest.prop.TableDrivenPropertyChecks._
import org.scalatest.time.SpanSugar._

import org.apache.spark.util.Utils

import scala.language.postfixOps

class DriverSuite extends FunSuite with Timeouts {

  test("driver should exit after finishing") {
    val sparkHome = sys.env.get("SPARK_HOME").orElse(sys.props.get("spark.home")).get
    // Regression test for SPARK-530: "Spark driver process doesn't exit after finishing"
    val masters = Table(("master"), ("local"), ("local-cluster[2,1,512]"))
    forAll(masters) { (master: String) =>
      failAfter(60 seconds) {
        Utils.executeAndGetOutput(
          Seq("./bin/spark-class", "org.apache.spark.DriverWithoutCleanup", master),
          new File(sparkHome),
          Map("SPARK_TESTING" -> "1", "SPARK_HOME" -> sparkHome))
      }
    }
  }
}

/**
 * Program that creates a Spark driver but doesn't call SparkContext.stop() or
 * System.exit() after finishing.
 */
object DriverWithoutCleanup {
  def main(args: Array[String]) {
    Logger.getRootLogger().setLevel(Level.WARN)
    val sc = new SparkContext(args(0), "DriverWithoutCleanup")
    sc.parallelize(1 to 100, 4).count()
  }
}

The Utils.executeAndGetOutput method accepts a command and, in this test, invokes spark-class to run the DriverWithoutCleanup class as a child process.

/**
 * Execute a command and get its output, throwing an exception if it yields a code other than 0.
 */
def executeAndGetOutput(
    command: Seq[String],
    workingDir: File = new File("."),
    extraEnvironment: Map[String, String] = Map.empty): String = {
  val builder = new ProcessBuilder(command: _*).directory(workingDir)
  val environment = builder.environment()
  for ((key, value) <- extraEnvironment) {
    environment.put(key, value)
  }
  val process = builder.start()  // start a process to run the Spark job
  new Thread("read stderr for " + command(0)) {
    override def run() {
      for (line <- Source.fromInputStream(process.getErrorStream).getLines) {
        System.err.println(line)
      }
    }
  }.start()
  val output = new StringBuffer
  val stdoutThread = new Thread("read stdout for " + command(0)) {  // read the output of the Spark job
    override def run() {
      for (line <- Source.fromInputStream(process.getInputStream).getLines) {
        output.append(line)
      }
    }
  }
  stdoutThread.start()
  val exitCode = process.waitFor()
  stdoutThread.join()  // wait for it to finish reading output
  if (exitCode != 0) {
    throw new SparkException("Process " + command + " exited with code " + exitCode)
  }
  output.toString  // return the Spark job's output
}
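
As a quick illustration of the signature above, here is a minimal usage sketch (not from the original article) that relies only on the default workingDir and extraEnvironment parameters; it assumes org.apache.spark.util.Utils is on the classpath:

import org.apache.spark.util.Utils

// Run a simple shell command with the default working directory and environment,
// and collect its stdout; a non-zero exit code would raise SparkException.
val listing = Utils.executeAndGetOutput(Seq("ls", "-l"))
println(listing)

DriverSuite passes all three parameters explicitly, so the child process runs from $SPARK_HOME with SPARK_TESTING set.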

Run the second command above to see the result:

sbt/sbt "test-only *DriverSuite*"

Execution result:

[info] Compiling 1 Scala source to /app/hadoop/spark-1.0.1/core/target/scala-2.10/test-classes...
[info] DriverSuite:    // the DriverSuite test suite is executed
Spark assembly has been built with Hive, including Datanucleus jars on classpath
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/app/hadoop/spark-1.0.1/lib_managed/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/app/hadoop/spark-1.0.1/assembly/target/scala-2.10/spark-assembly-1.0.1-hadoop0.20.2-cdh3u5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/08/14 18:20:15 WARN spark.SparkConf: SPARK_CLASSPATH was detected (set to '/home/hadoop/src/hadoop/lib/:/app/hadoop/sparklib/*:/app/hadoop/spark-1.0.1/lib_managed/jars/*').
This is deprecated in Spark 1.0+.
Please instead use:
 - ./spark-submit with --driver-class-path to augment the driver classpath
 - spark.executor.extraClassPath to augment the executor classpath
14/08/14 18:20:15 WARN spark.SparkConf: Setting 'spark.executor.extraClassPath' to '/home/hadoop/src/hadoop/lib/:/app/hadoop/sparklib/*:/app/hadoop/spark-1.0.1/lib_managed/jars/*' as a work-around.
14/08/14 18:20:15 WARN spark.SparkConf: Setting 'spark.driver.extraClassPath' to '/home/hadoop/src/hadoop/lib/:/app/hadoop/sparklib/*:/app/hadoop/spark-1.0.1/lib_managed/jars/*' as a work-around.
Spark assembly has been built with Hive, including Datanucleus jars on classpath
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/app/hadoop/spark-1.0.1/lib_managed/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/app/hadoop/spark-1.0.1/assembly/target/scala-2.10/spark-assembly-1.0.1-hadoop0.20.2-cdh3u5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/08/14 18:20:19 WARN spark.SparkConf: SPARK_CLASSPATH was detected (set to '/home/hadoop/src/hadoop/lib/:/app/hadoop/sparklib/*:/app/hadoop/spark-1.0.1/lib_managed/jars/*').
This is deprecated in Spark 1.0+.
Please instead use:
 - ./spark-submit with --driver-class-path to augment the driver classpath
 - spark.executor.extraClassPath to augment the executor classpath
14/08/14 18:20:19 WARN spark.SparkConf: Setting 'spark.executor.extraClassPath' to '/home/hadoop/src/hadoop/lib/:/app/hadoop/sparklib/*:/app/hadoop/spark-1.0.1/lib_managed/jars/*' as a work-around.
14/08/14 18:20:19 WARN spark.SparkConf: Setting 'spark.driver.extraClassPath' to '/home/hadoop/src/hadoop/lib/:/app/hadoop/sparklib/*:/app/hadoop/spark-1.0.1/lib_managed/jars/*' as a work-around.
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Spark assembly has been built with Hive, including Datanucleus jars on classpath
[info] - driver should exit after finishing
[info] ScalaTest
[info] Run completed in 12 seconds, 586 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[info] Passed: Total 1, Failed 0, Errors 0, Passed 1
[success] Total time: 76 s, completed Aug 14, 2014 6:20:26 PM

The test passed: Total 1, Failed 0, Errors 0, Passed 1.

If we change the test case slightly and make the Spark job throw an exception, the test case will fail, as follows:

object DriverWithoutCleanup {
  def main(args: Array[String]) {
    Logger.getRootLogger().setLevel(Level.WARN)
    val sc = new SparkContext(args(0), "DriverWithoutCleanup")
    sc.parallelize(1 to 100, 4).count()
    throw new RuntimeException("OopsOutOfMemory, haha, not real OOM, don't worry!")  // add this line
  }
}
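
As an aside, if you wanted the suite itself to assert that such a broken driver fails (instead of letting the test fail), ScalaTest's intercept could be used inside the test body. This is only a sketch, not part of the real DriverSuite, and it assumes it is placed inside the same DriverSuite class so that the imports, Timeouts, and Utils are in scope:

// Sketch only: expect executeAndGetOutput to throw SparkException because the
// child process now exits with a non-zero code.
test("broken driver should fail spark-class") {
  val sparkHome = sys.env.get("SPARK_HOME").orElse(sys.props.get("spark.home")).get
  failAfter(60 seconds) {
    intercept[SparkException] {
      Utils.executeAndGetOutput(
        Seq("./bin/spark-class", "org.apache.spark.DriverWithoutCleanup", "local"),
        new File(sparkHome),
        Map("SPARK_TESTING" -> "1", "SPARK_HOME" -> sparkHome))
    }
  }
}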

Then run the test again:

You will see the following errors:

[info] DriverSuite:
Spark assembly has been built with Hive, including Datanucleus jars on classpath
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/app/hadoop/spark-1.0.1/lib_managed/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/app/hadoop/spark-1.0.1/assembly/target/scala-2.10/spark-assembly-1.0.1-hadoop0.20.2-cdh3u5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/08/14 18:40:07 WARN spark.SparkConf: SPARK_CLASSPATH was detected (set to '/home/hadoop/src/hadoop/lib/:/app/hadoop/sparklib/*:/app/hadoop/spark-1.0.1/lib_managed/jars/*').
This is deprecated in Spark 1.0+.
Please instead use:
 - ./spark-submit with --driver-class-path to augment the driver classpath
 - spark.executor.extraClassPath to augment the executor classpath
14/08/14 18:40:07 WARN spark.SparkConf: Setting 'spark.executor.extraClassPath' to '/home/hadoop/src/hadoop/lib/:/app/hadoop/sparklib/*:/app/hadoop/spark-1.0.1/lib_managed/jars/*' as a work-around.
14/08/14 18:40:07 WARN spark.SparkConf: Setting 'spark.driver.extraClassPath' to '/home/hadoop/src/hadoop/lib/:/app/hadoop/sparklib/*:/app/hadoop/spark-1.0.1/lib_managed/jars/*' as a work-around.
Exception in thread "main" java.lang.RuntimeException: OopsOutOfMemory, haha, not real OOM, don't worry!    // the exception we threw makes the Spark job fail; its stack trace is printed and the test case fails
	at org.apache.spark.DriverWithoutCleanup$.main(DriverSuite.scala:60)
	at org.apache.spark.DriverWithoutCleanup.main(DriverSuite.scala)
[info] - driver should exit after finishing *** FAILED ***
[info]   SparkException was thrown during property evaluation. (DriverSuite.scala:40)
[info]     Message: Process List(./bin/spark-class, org.apache.spark.DriverWithoutCleanup, local) exited with code 1
[info]     Occurred at table row 0 (zero based, not counting headings), which had values (
[info]       master = local
[info]     )
[info] ScalaTest
[info] Run completed in 4 seconds, 765 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 0, failed 1, canceled 0, ignored 0, pending 0
[info] *** 1 TEST FAILED ***
[error] Failed: Total 1, Failed 1, Errors 0, Passed 0
[error] Failed tests:
[error] 	org.apache.spark.DriverSuite
[error] (core/test:testOnly) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 14 s, completed Aug 14, 2014 6:40:10 PM
You can see that the test failed.

Summary: This article explains how to run Spark test cases, covering the commands for running all test cases and for running a single test case, and walks through an example of both a passing run and a failing run. If you want to become a Spark contributor, this is something you must know how to do.

-- EOF --

Original article; please credit the source when reposting: http://blog.csdn.net/oopsoom/article/details/38555173
