I have an sbt project that I am trying to build into a jar with the sbt-assembly plugin.
build.sbt:
name := "project-name"
version := "0.1"
scalaVersion := "2.11.12"
val sparkVersion = "2.4.0"
libraryDependencies ++= Seq(
  "org.scalatest" %% "scalatest" % "3.0.5" % "test",
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-streaming" % sparkVersion % "provided",
  "com.holdenkarau" %% "spark-testing-base" % "2.3.1_0.10.0" % "test",
  // spark-hive dependencies for DataFrameSuiteBase. https://github.com/holdenk/spark-testing-base/issues/143
  "org.apache.spark" %% "spark-hive" % sparkVersion % "provided",
  "com.amazonaws" % "aws-java-sdk" % "1.11.513" % "provided",
  "com.amazonaws" % "aws-java-sdk-sqs" % "1.11.513" % "provided",
  "com.amazonaws" % "aws-java-sdk-s3" % "1.11.513" % "provided",
  //"org.apache.hadoop" % "hadoop-aws" % "3.1.1"
  "org.json" % "json" % "20180813"
)
assemblyOption in assembly := (assemblyOption in assembly).value.copy(includeScala = false)
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x => MergeStrategy.first
}
test in assembly := {}
// https://github.com/holdenk/spark-testing-base
fork in Test := true
javaOptions ++= Seq("-Xms512M", "-Xmx2048M", "-XX:MaxPermSize=2048M", "-XX:+CMSClassUnloadingEnabled")
parallelExecution in Test := false
When I build the project with sbt assembly, the resulting jar contains /org/junit/... and /org/opentest4j/... files.
Is there any way to keep these test-related files out of the final jar?
I have tried replacing the line:
"org.scalatest" %% "scalatest" % "3.0.5" % "test"
with:
"org.scalatest" %% "scalatest" % "3.0.5" % "provided"
I am also wondering how these files end up in the jar, since JUnit is not referenced anywhere in build.sbt (the project does contain JUnit tests, however).
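One way to narrow down where the classes come from might be to list the entries on the classpath that sbt-assembly packs into the jar. The following is only a sketch: the task name showTestJars is made up for illustration, and the name patterns it matches are guesses. It assumes the classes arrive in jars (transitive dependencies, or unmanaged jars in the project's lib/ directory) rather than from the project's own sources.

```scala
// Hypothetical helper task; the name showTestJars is invented for this sketch.
lazy val showTestJars = taskKey[Unit]("Print assembly classpath entries that look like test libraries")

showTestJars := {
  // The classpath that sbt-assembly uses as input for the fat jar
  val cp = (fullClasspath in assembly).value
  cp.map(_.data)
    .filter { f =>
      // Assumed name patterns; adjust after inspecting the real jar names
      val n = f.getName.toLowerCase
      n.contains("junit") || n.contains("opentest4j")
    }
    .foreach(f => println(f.getAbsolutePath))
}
```

Running sbt showTestJars should then print the path of every matching entry, which shows whether it was resolved as a dependency or picked up from lib/.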
Updated build.sbt:
name := "project-name"
version := "0.1"
scalaVersion := "2.11.12"
val sparkVersion = "2.4.0"
val excludeJUnitBinding = ExclusionRule(organization = "junit")
libraryDependencies ++= Seq(
  // Provided
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided" excludeAll(excludeJUnitBinding),
  "org.apache.spark" %% "spark-sql" % sparkVersion % "provided" excludeAll(excludeJUnitBinding),
  "org.apache.spark" %% "spark-streaming" % sparkVersion % "provided",
  "com.holdenkarau" %% "spark-testing-base" % "2.3.1_0.10.0" % "provided" excludeAll(excludeJUnitBinding),
  "org.apache.spark" %% "spark-hive" % sparkVersion % "provided",
  "com.amazonaws" % "aws-java-sdk" % "1.11.513" % "provided",
  "com.amazonaws" % "aws-java-sdk-sqs" % "1.11.513" % "provided",
  "com.amazonaws" % "aws-java-sdk-s3" % "1.11.513" % "provided",
  // Test
  "org.scalatest" %% "scalatest" % "3.0.5" % "test",
  // Necessary
  "org.json" % "json" % "20180813"
)
excludeDependencies += excludeJUnitBinding
// https://stackguides.com/questions/25144484/sbt-assembly-deduplication-found-error
assemblyOption in assembly := (assemblyOption in assembly).value.copy(includeScala = false)
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x => MergeStrategy.first
}
// https://github.com/holdenk/spark-testing-base
fork in Test := true
javaOptions ++= Seq("-Xms512M", "-Xmx2048M", "-XX:MaxPermSize=2048M", "-XX:+CMSClassUnloadingEnabled")
parallelExecution in Test := false
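If the exclusion rules above still leave the classes in the jar, another option that may be worth trying is sbt-assembly's assemblyExcludedJars setting, which drops whole jars from the assembly only and leaves the test classpath alone. A minimal sketch, assuming the offending jars have file names beginning with junit or opentest4j (those prefixes are guesses, not taken from this build):

```scala
// Sketch only: the jar name prefixes below are assumptions.
assemblyExcludedJars in assembly := {
  val cp = (fullClasspath in assembly).value
  // Every classpath entry returned here is left out of the fat jar
  cp.filter { entry =>
    val name = entry.data.getName
    name.startsWith("junit") || name.startsWith("opentest4j")
  }
}
```

Because this filters by jar file name, it has no effect on classes compiled from the project's own source directories.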