1 vote

I'm trying to run a simple test Spark job. When I add the spark-cassandra-connector dependency (either v. 1.2.0 or v. 1.2.1), the job fails.

Here is my build file:

name := "spark test"

version := "1.0"

scalaVersion := "2.10.4"

resolvers += "Typesafe Repo" at "http://repo.typesafe.com/typesafe/releases"

libraryDependencies ++= Seq(
                "org.apache.spark" %% "spark-core" % "1.2.1",
                "com.datastax.spark" %% "spark-cassandra-connector" % "1.2.1")

And here is the source code:

package com.xxx.test

import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}

object Test {

  def main(args: Array[String]) {

    val conf = new SparkConf()
      .set("spark.executor.home", "/home/ubuntu/spark-1.2.1-bin-hadoop2.4")
      .setMaster("local[*]")
//    .setMaster("spark://10.14.56.139:7077")
      .setAppName("Test")

    val sc = new SparkContext(conf)

    val numbers = sc.parallelize(1 to 100)
    numbers.map(_.toDouble).count

  }

}

As you can see, I'm not actually using the connector yet. I do want to, but when I tried it threw an error, and I'm trying to isolate where it comes from. The error also occurs when I change the connector version to 1.2.1 (see below), but not when I use 1.2.0-rc3 or when I remove the dependency (and the import) altogether. Since the connector's GitHub page recommends 1.2.1, I would like to use that version. Here is the error I'm getting:

15/05/20 09:41:47 ERROR Executor: Exception in task 1.0 in stage 0.0 (TID 1)
java.io.IOException: java.lang.ClassNotFoundException: scala.collection.immutable.Range
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1078)

When I run it on a cluster (setMaster("spark://10.14.56.139:7077")), I get a different error, but still a fatal one:

15/05/20 10:18:55 ERROR TaskResultGetter: Exception while getting task result
java.io.IOException: java.lang.ClassNotFoundException: scala.None$
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1078)

I would like to use the recommended 1.2.1 version in our production environment, so any help figuring out what's going on would be great.

I'm using sbt 0.13.8 and Ubuntu 14.04.

Can you post your build output? Also, are you using an assembly jar? - Holden

1 Answer

2 votes

At first glance, it seems that your jar is missing some core Scala libraries. I'd recommend building an assembly jar (and you can mark spark-core as provided if you are submitting with the spark-submit script to an existing cluster).
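
A minimal sketch of that setup with the sbt-assembly plugin might look like the following (the plugin version and the jar path in the spark-submit line are assumptions; adjust them to your project):

// project/plugins.sbt -- add the sbt-assembly plugin (version is an assumption; use one compatible with sbt 0.13.x)
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.13.0")

// build.sbt -- mark spark-core as provided so it is not bundled into the fat jar;
// the connector (and the classes it pulls in) will be included
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.2.1" % "provided",
  "com.datastax.spark" %% "spark-cassandra-connector" % "1.2.1")

Then run sbt assembly and submit the resulting fat jar to the cluster, for example:

spark-submit --class com.xxx.test.Test \
  --master spark://10.14.56.139:7077 \
  target/scala-2.10/<project-name>-assembly-1.0.jar

(The exact jar name depends on your project name and Scala version.) Note that if you keep running locally via sbt run with setMaster("local[*]"), spark-core should not be marked provided there, since nothing else supplies it in that case.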