Run Test Case on Spark
Today a friend asked me how to run Spark's unit tests, so here is how to do it with sbt.
Spark's test cases can be run with sbt's test commands.
1. Run all test cases
sbt/sbt test
2. Run a single test case
sbt/sbt "test-only *DriverSuite*"
Here is an example.
This test case lives at $SPARK_HOME/core/src/test/scala/org/apache/spark/DriverSuite.scala.
DriverSuite extends FunSuite, ScalaTest's basic test suite trait. It is essentially a regression test: it verifies that the driver process exits cleanly once the Spark program has finished.
Note: for this article I staged both a passing and a failing run; the failing scenario does not exactly match DriverSuite's real purpose and is purely for demonstration. :)
First, the run that finishes and exits normally:
package org.apache.spark

import java.io.File

import org.apache.log4j.Logger
import org.apache.log4j.Level
import org.scalatest.FunSuite
import org.scalatest.concurrent.Timeouts
import org.scalatest.prop.TableDrivenPropertyChecks._
import org.scalatest.time.SpanSugar._

import org.apache.spark.util.Utils

import scala.language.postfixOps

class DriverSuite extends FunSuite with Timeouts {
  test("driver should exit after finishing") {
    val sparkHome = sys.env.get("SPARK_HOME").orElse(sys.props.get("spark.home")).get
    // Regression test for SPARK-530: "Spark driver process doesn't exit after finishing"
    val masters = Table(("master"), ("local"), ("local-cluster[2,1,512]"))
    forAll(masters) { (master: String) =>
      failAfter(60 seconds) {
        Utils.executeAndGetOutput(
          Seq("./bin/spark-class", "org.apache.spark.DriverWithoutCleanup", master),
          new File(sparkHome),
          Map("SPARK_TESTING" -> "1", "SPARK_HOME" -> sparkHome))
      }
    }
  }
}

/**
 * Program that creates a Spark driver but doesn't call SparkContext.stop() or
 * Sys.exit() after finishing.
 */
object DriverWithoutCleanup {
  def main(args: Array[String]) {
    Logger.getRootLogger().setLevel(Level.WARN)
    val sc = new SparkContext(args(0), "DriverWithoutCleanup")
    sc.parallelize(1 to 100, 4).count()
  }
}
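To make the ScalaTest pieces used above easier to see in isolation, here is a minimal, Spark-free sketch of my own (the class name and table rows are made up, and it assumes the same ScalaTest version as DriverSuite): Table turns each row into one execution of the forAll body, and failAfter aborts a run that exceeds the given time span.

import org.scalatest.FunSuite
import org.scalatest.concurrent.Timeouts
import org.scalatest.prop.TableDrivenPropertyChecks._
import org.scalatest.time.SpanSugar._

// Spark-free sketch of the table-driven + timeout pattern used by DriverSuite.
class TimeoutTableSketch extends FunSuite with Timeouts {
  test("each master string is handled within the time limit") {
    val masters = Table("master", "local", "local[2]")  // one run per row
    forAll(masters) { master =>
      failAfter(5.seconds) {
        // stand-in for launching a driver process against `master`
        assert(master.startsWith("local"))
      }
    }
  }
}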
The executeAndGetOutput method takes a command, invokes spark-class, and runs the DriverWithoutCleanup class:
/**
 * Execute a command and get its output, throwing an exception if it yields a code other than 0.
 */
def executeAndGetOutput(
    command: Seq[String],
    workingDir: File = new File("."),
    extraEnvironment: Map[String, String] = Map.empty): String = {
  val builder = new ProcessBuilder(command: _*)
    .directory(workingDir)
  val environment = builder.environment()
  for ((key, value) <- extraEnvironment) {
    environment.put(key, value)
  }
  val process = builder.start()  // launch the child process (here, the spark job)
  new Thread("read stderr for " + command(0)) {
    override def run() {
      for (line <- Source.fromInputStream(process.getErrorStream).getLines) {
        System.err.println(line)
      }
    }
  }.start()
  val output = new StringBuffer
  val stdoutThread = new Thread("read stdout for " + command(0)) {  // capture the spark job's stdout
    override def run() {
      for (line <- Source.fromInputStream(process.getInputStream).getLines) {
        output.append(line)
      }
    }
  }
  stdoutThread.start()
  val exitCode = process.waitFor()
  stdoutThread.join()  // Wait for it to finish reading output
  if (exitCode != 0) {
    throw new SparkException("Process " + command + " exited with code " + exitCode)
  }
  output.toString  // return the spark job's output
}
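One design point worth noting: stdout and stderr are drained on separate threads so the child process never blocks on a full pipe buffer before waitFor() returns. Below is a condensed, Spark-free sketch of my own of the same pattern (the object and method names are hypothetical): run a command, capture its stdout on a reader thread, and throw if the exit code is non-zero.

import java.io.File
import scala.io.Source

object RunAndCaptureSketch {
  // Sketch of the executeAndGetOutput pattern, without the stderr thread and env handling.
  def runAndCapture(command: Seq[String], workingDir: File = new File(".")): String = {
    val process = new ProcessBuilder(command: _*).directory(workingDir).start()
    val output = new StringBuffer
    val stdoutThread = new Thread("read stdout for " + command(0)) {
      override def run(): Unit = {
        for (line <- Source.fromInputStream(process.getInputStream).getLines()) {
          output.append(line)
        }
      }
    }
    stdoutThread.start()
    val exitCode = process.waitFor()  // wait for the child process to exit
    stdoutThread.join()               // make sure all of its output has been read
    if (exitCode != 0) {
      sys.error("Process " + command + " exited with code " + exitCode)
    }
    output.toString
  }

  def main(args: Array[String]): Unit = {
    println(runAndCapture(Seq("echo", "hello")))  // prints "hello"
  }
}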
Running the second command, sbt/sbt "test-only *DriverSuite*", produces the following output:
[info] Compiling 1 Scala source to /app/hadoop/spark-1.0.1/core/target/scala-2.10/test-classes...
[info] DriverSuite: // the DriverSuite test suite starts here
Spark assembly has been built with Hive, including Datanucleus jars on classpath
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/app/hadoop/spark-1.0.1/lib_managed/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/app/hadoop/spark-1.0.1/assembly/target/scala-2.10/spark-assembly-1.0.1-hadoop0.20.2-cdh3u5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/08/14 18:20:15 WARN spark.SparkConf:
SPARK_CLASSPATH was detected (set to '/home/hadoop/src/hadoop/lib/:/app/hadoop/sparklib/*:/app/hadoop/spark-1.0.1/lib_managed/jars/*').
This is deprecated in Spark 1.0+.
Please instead use:
- ./spark-submit with --driver-class-path to augment the driver classpath
- spark.executor.extraClassPath to augment the executor classpath
14/08/14 18:20:15 WARN spark.SparkConf: Setting 'spark.executor.extraClassPath' to '/home/hadoop/src/hadoop/lib/:/app/hadoop/sparklib/*:/app/hadoop/spark-1.0.1/lib_managed/jars/*' as a work-around.
14/08/14 18:20:15 WARN spark.SparkConf: Setting 'spark.driver.extraClassPath' to '/home/hadoop/src/hadoop/lib/:/app/hadoop/sparklib/*:/app/hadoop/spark-1.0.1/lib_managed/jars/*' as a work-around.
Spark assembly has been built with Hive, including Datanucleus jars on classpath
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/app/hadoop/spark-1.0.1/lib_managed/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/app/hadoop/spark-1.0.1/assembly/target/scala-2.10/spark-assembly-1.0.1-hadoop0.20.2-cdh3u5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/08/14 18:20:19 WARN spark.SparkConf:
SPARK_CLASSPATH was detected (set to '/home/hadoop/src/hadoop/lib/:/app/hadoop/sparklib/*:/app/hadoop/spark-1.0.1/lib_managed/jars/*').
This is deprecated in Spark 1.0+.
Please instead use:
- ./spark-submit with --driver-class-path to augment the driver classpath
- spark.executor.extraClassPath to augment the executor classpath
14/08/14 18:20:19 WARN spark.SparkConf: Setting 'spark.executor.extraClassPath' to '/home/hadoop/src/hadoop/lib/:/app/hadoop/sparklib/*:/app/hadoop/spark-1.0.1/lib_managed/jars/*' as a work-around.
14/08/14 18:20:19 WARN spark.SparkConf: Setting 'spark.driver.extraClassPath' to '/home/hadoop/src/hadoop/lib/:/app/hadoop/sparklib/*:/app/hadoop/spark-1.0.1/lib_managed/jars/*' as a work-around.
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Spark assembly has been built with Hive, including Datanucleus jars on classpath
[info] - driver should exit after finishing
[info] ScalaTest
[info] Run completed in 12 seconds, 586 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[info] Passed: Total 1, Failed 0, Errors 0, Passed 1
[success] Total time: 76 s, completed Aug 14, 2014 6:20:26 PM
The test passes: Total 1, Failed 0, Errors 0, Passed 1.
Now, if we change the test case slightly so that the spark job throws an exception, the test case fails, as shown below.
object DriverWithoutCleanup {
  def main(args: Array[String]) {
    Logger.getRootLogger().setLevel(Level.WARN)
    val sc = new SparkContext(args(0), "DriverWithoutCleanup")
    sc.parallelize(1 to 100, 4).count()
    throw new RuntimeException("OopsOutOfMemory, haha, not real OOM, don't worry!")  // deliberately fail the driver
  }
}
Then re-run the test, and the failure appears:
[info] DriverSuite:
Spark assembly has been built with Hive, including Datanucleus jars on classpath
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/app/hadoop/spark-1.0.1/lib_managed/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/app/hadoop/spark-1.0.1/assembly/target/scala-2.10/spark-assembly-1.0.1-hadoop0.20.2-cdh3u5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/08/14 18:40:07 WARN spark.SparkConf:
SPARK_CLASSPATH was detected (set to '/home/hadoop/src/hadoop/lib/:/app/hadoop/sparklib/*:/app/hadoop/spark-1.0.1/lib_managed/jars/*').
This is deprecated in Spark 1.0+.
Please instead use:
- ./spark-submit with --driver-class-path to augment the driver classpath
- spark.executor.extraClassPath to augment the executor classpath
14/08/14 18:40:07 WARN spark.SparkConf: Setting 'spark.executor.extraClassPath' to '/home/hadoop/src/hadoop/lib/:/app/hadoop/sparklib/*:/app/hadoop/spark-1.0.1/lib_managed/jars/*' as a work-around.
14/08/14 18:40:07 WARN spark.SparkConf: Setting 'spark.driver.extraClassPath' to '/home/hadoop/src/hadoop/lib/:/app/hadoop/sparklib/*:/app/hadoop/spark-1.0.1/lib_managed/jars/*' as a work-around.
Exception in thread "main" java.lang.RuntimeException: OopsOutOfMemory, haha, not real OOM, don't worry! // the spark job throws the exception we added, so spark-class exits with a non-zero code
at org.apache.spark.DriverWithoutCleanup$.main(DriverSuite.scala:60)
at org.apache.spark.DriverWithoutCleanup.main(DriverSuite.scala)
[info] - driver should exit after finishing *** FAILED ***
[info] SparkException was thrown during property evaluation. (DriverSuite.scala:40)
[info] Message: Process List(./bin/spark-class, org.apache.spark.DriverWithoutCleanup, local) exited with code 1
[info] Occurred at table row 0 (zero based, not counting headings), which had values (
[info] master = local
[info] )
[info] ScalaTest
[info] Run completed in 4 seconds, 765 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 0, failed 1, canceled 0, ignored 0, pending 0
[info] *** 1 TEST FAILED ***
[error] Failed: Total 1, Failed 1, Errors 0, Passed 0
[error] Failed tests:
[error] org.apache.spark.DriverSuite
[error] (core/test:testOnly) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 14 s, completed Aug 14, 2014 6:40:10 PM
We can see TEST FAILED.
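As an aside, a sketch of my own (not part of DriverSuite): if you wanted the misbehaving driver to be a deliberate negative test instead of a red build, you could add a test to the suite that wraps the call in ScalaTest's intercept and asserts that the non-zero exit code surfaces as a SparkException:

// Hypothetical extra test inside DriverSuite: expects executeAndGetOutput to throw
// because DriverWithoutCleanup now exits with a non-zero code.
test("a driver that throws should make spark-class fail") {
  val sparkHome = sys.env.get("SPARK_HOME").orElse(sys.props.get("spark.home")).get
  intercept[SparkException] {
    Utils.executeAndGetOutput(
      Seq("./bin/spark-class", "org.apache.spark.DriverWithoutCleanup", "local"),
      new File(sparkHome),
      Map("SPARK_TESTING" -> "1", "SPARK_HOME" -> sparkHome))
  }
}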
3. Summary
This article mainly covered how to run Spark's tests: the command for running all test cases and the command for running a single test case, illustrated with one example in both its passing and failing form. The finer details still need further exploration. If you want to become a contributor, this is a gate you have to pass.
——EOF——
This is an original article; please credit the source when reposting: http://blog.csdn.net/oopsoom/article/details/38555173