25
votes

In the Spark 2.1 docs it's mentioned that:

Spark runs on Java 7+, Python 2.6+/3.4+ and R 3.1+. For the Scala API, Spark 2.1.0 uses Scala 2.11. You will need to use a compatible Scala version (2.11.x).

In the Scala 2.12 release notes it's also mentioned that:

Although Scala 2.11 and 2.12 are mostly source compatible to facilitate cross-building, they are not binary compatible. This allows us to keep improving the Scala compiler and standard library.

But when I build an uber JAR (using Scala 2.12) and run it on Spark 2.1, everything works just fine.

And I know it's not an official source, but on the 47 Degrees blog they mention that Spark 2.1 does support Scala 2.12.

How can one explain these (conflicting?) pieces of information?

3
There is a formal difference, i.e. "we support that version, we have tested it, and if you have issues then it's a bug on our side" vs. "do it your way, experiment if you wish, but if you have issues then don't come back whining". (Samson Scharfrichter)

Yes, but how can it work if Scala 2.11 is not binary compatible with 2.12? (NetanelRabinowitz)

Not compatible means that there is at least one issue. It could be OK for 99.99% of the API calls. How much did you test with your custom uber JAR? Maybe 15%? (Samson Scharfrichter)

3 Answers

35
votes

Spark does not support Scala 2.12. You can follow SPARK-14220 (Build and test Spark against Scala 2.12) to get up to date status.

Update: Spark 2.4 added experimental Scala 2.12 support.

2
votes

Scala 2.12 is officially supported (and required) as of Spark 3. Summary:

  • Spark 2.0 - 2.3: Required Scala 2.11
  • Spark 2.4: Supported both Scala 2.11 and Scala 2.12, but in practice almost all Spark runtimes only shipped with Scala 2.11.
  • Spark 3: Only Scala 2.12 is supported

Using a Spark runtime that's compiled with one Scala version and a JAR file that's compiled with another Scala version is dangerous and causes strange bugs. For example, as noted here, using a Scala 2.11 compiled JAR on a Spark 3 cluster will cause this error: `java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps`.

Look at all the poor Spark users running into this very error.

Make sure to look into Scala cross-compilation and understand the %% operator in sbt to limit your suffering. Maintaining Scala projects is hard, and minimizing your dependencies is recommended.
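To make the %% point concrete, here is a minimal build.sbt sketch (the Spark artifact version shown is just an example): %% appends your project's Scala binary version to the artifact name, which is exactly what keeps you from mixing Scala 2.11 and 2.12 JARs by accident.

```scala
// build.sbt -- minimal sketch; the Spark version here is illustrative
scalaVersion := "2.12.15"

// %% appends the Scala binary version to the artifact name,
// so with scalaVersion 2.12.x this resolves to "spark-sql_2.12":
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.1.2" % "provided"

// Equivalent, but hard-coded with a single % (easy to get wrong
// when you later bump scalaVersion):
// libraryDependencies += "org.apache.spark" % "spark-sql_2.12" % "3.1.2" % "provided"
```

With %% in place, changing `scalaVersion` automatically pulls the matching `_2.11`/`_2.12` artifact, or fails to resolve if the library was never published for that Scala version, which is far better than a `NoSuchMethodError` at runtime.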

0
votes

To add to the answer, I believe that blog claim is a typo: the Spark 2.0.0 release notes (https://spark.apache.org/releases/spark-release-2-0-0.html) make no mention of Scala 2.12.

Also, if we look at the timing, Scala 2.12 was not released until November 2016, while Spark 2.0.0 was released in July 2016.

References: https://spark.apache.org/news/index.html

https://www.scala-lang.org/news/2.12.0/