1 vote

I have configured the Spark 2.0 shell to run with the DataStax Cassandra connector.

spark-shell --packages datastax:spark-cassandra-connector:1.5.1-s_2.11

When running this snippet in the shell

sc.stop
import org.apache.spark
import org.apache.spark._
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import com.datastax.spark
import com.datastax.spark._
import com.datastax.spark.connector
import com.datastax.spark.connector._
import com.datastax.spark.connector.cql
import com.datastax.spark.connector.cql._
import com.datastax.spark.connector.cql.CassandraConnector
import com.datastax.spark.connector.cql.CassandraConnector._

val conf = new SparkConf(true).set("spark.cassandra.connection.host", "dbserver")
val sc = new SparkContext("spark://localhost:7077", "test", conf)
val table = sc.cassandraTable("keyspace", "users")
println(table.count)
println(table.first)

On this line

scala> val table = sc.cassandraTable("keyspace", "users")

I get this error

java.lang.NoClassDefFoundError: com/datastax/spark/connector/cql/CassandraConnector$
at com.datastax.spark.connector.SparkContextFunctions.cassandraTable$default$3(SparkContextFunctions.scala:48)
... 62 elided
3
I think the problem with your approach is that the Cassandra connector cannot find some classes on the classpath. You can build a fat jar (for example, by running 'sbt assembly' in the Cassandra connector project) and then use that local jar in the Spark shell. – codejitsu
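One way to confirm this hypothesis from the Scala REPL is to check whether the connector class is actually visible on the classpath. A quick diagnostic sketch (the class name is taken from the NoClassDefFoundError above; the helper is hypothetical, not part of any API):

```scala
// Returns true if the named class can be loaded from the current classpath.
def onClasspath(name: String): Boolean =
  try { Class.forName(name); true }
  catch { case _: ClassNotFoundException => false }

// The class from the error above; prints false when the connector jar
// is missing from the shell's classpath.
println(onClasspath("com.datastax.spark.connector.cql.CassandraConnector"))
```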

3 Answers

3 votes

As I already said, one option is to build a fat jar with all the Cassandra connector dependencies included. You can do it like this:

$ git clone https://github.com/datastax/spark-cassandra-connector.git
$ cd spark-cassandra-connector
$ sbt assembly

Then pass the local assembly jar to the Spark shell via the --jars command-line parameter.
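For example (the jar path below is an assumption; the actual assembly jar name depends on the connector and Scala versions you built with):

```shell
# Hypothetical path: use the jar that 'sbt assembly' actually produced
# under target/scala-2.11/ in your connector checkout.
spark-shell --jars /path/to/spark-cassandra-connector-assembly.jar
```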

1 vote

You probably need to add a few more dependencies or bump their versions. In my Java project I was using these:

com.datastax.spark:spark-cassandra-connector_2.10:1.3.0-M2
com.datastax.spark:spark-cassandra-connector-java_2.10:1.3.0-M2
org.apache.spark:spark-core_2.10:1.3.0
org.apache.spark:spark-streaming_2.10:1.3.0

Try it and let me know.
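If you manage dependencies with sbt rather than Maven, the equivalent entries might look like this (a sketch mirroring the versions above; adjust the Scala and Spark versions to match your setup):

```scala
// build.sbt sketch; versions mirror the Maven coordinates listed above.
scalaVersion := "2.10.5"

libraryDependencies ++= Seq(
  "com.datastax.spark" %% "spark-cassandra-connector"      % "1.3.0-M2",
  "com.datastax.spark" %% "spark-cassandra-connector-java" % "1.3.0-M2",
  "org.apache.spark"   %% "spark-core"                     % "1.3.0" % "provided",
  "org.apache.spark"   %% "spark-streaming"                % "1.3.0" % "provided"
)
```

Marking the Spark artifacts as "provided" keeps them out of your own fat jar, since the cluster already supplies them.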

1 vote

The connector version 1.5 is not compatible with Spark 2.0. Check out the current master branch or the tag for 2.0.0-m1. The fat jar created by sbt assembly on that branch should work. We should have official Spark Packages and Maven coordinates for this release soon.
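Concretely, the build steps from the earlier answer then become something like the following (the exact tag name is an assumption; list the available tags first):

```shell
git clone https://github.com/datastax/spark-cassandra-connector.git
cd spark-cassandra-connector
# Either stay on master, or check out the 2.0.0-m1 tag mentioned above.
# The tag name 'v2.0.0-M1' is an assumption - run 'git tag' to see what exists.
git checkout v2.0.0-M1
sbt assembly
```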