5 votes

I'm trying to join an RDD with a Cassandra table using the Spark Cassandra Connector:

import com.datastax.spark.connector._  // provides joinWithCassandraTable and SomeColumns

samplerdd.joinWithCassandraTable(keyspace, CassandraParams.table)
  .on(SomeColumns(t.date as a.date,
                  t.key as a.key))

It works in standalone mode, but when I execute it in cluster mode I get this error:

Job aborted due to stage failure: Task 6 in stage 0.0 failed 4 times, most recent failure: Lost task 6.3 in stage 0.0 (TID 20, 10.10.10.51): java.io.InvalidClassException: com.datastax.spark.connector.rdd.CassandraJoinRDD; local class incompatible: stream classdesc serialVersionUID = 6155891939893411978, local class serialVersionUID = 1245204129865863681

I have already checked the jars on the master and the slaves, and they seem to be the same versions.

I'm using Spark 2.0.0, Cassandra 3.7, Spark Cassandra Connector 2.0.0-M2, Cassandra Driver Core 3.1.0, and Scala 2.11.8.

What could be happening?

You have a dependency version mismatch somewhere. Look for it. – maasg
You mention "I have already checked the jars on the master and the slaves". You should not place jars on the slaves. Instead, use spark-submit --jars <deps...> to submit your job. See spark.apache.org/docs/latest/… – maasg
I'm using .master("spark://sparkmaster:7077") instead of spark-submit. I configure the jars with setJars (see the sketch below). Is that right? – Manuel Valero
That should work too (that's how I do it too). Check whether the slaves have a local copy of the spark.cassandra.connector jar. – maasg
It should NOT exist on any slave. Let the driver deal with distributing the right jars. – maasg
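
For reference, a minimal sketch of the setJars approach discussed in these comments. Only the master URL comes from the thread; the jar paths and Cassandra host are placeholders, not from the question:

    import org.apache.spark.{SparkConf, SparkContext}

    // The driver ships these jars to the executors, so nothing needs to be
    // pre-installed on the slaves. The paths below are hypothetical examples;
    // list every dependency jar the job needs, all from the same build.
    val conf = new SparkConf()
      .setAppName("cassandra-join-example")
      .setMaster("spark://sparkmaster:7077")
      .setJars(Seq(
        "/path/to/spark-cassandra-connector-assembly-2.0.0-M2.jar",
        "/path/to/your-application.jar"
      ))
      .set("spark.cassandra.connection.host", "cassandra-host") // placeholder host

    val sc = new SparkContext(conf)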

1 Answer

1 vote

Finally solved: updating the cassandra-driver-core dependency to 3.0.0 makes it work. – Manuel Valero
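
For anyone hitting the same InvalidClassException, the fix amounts to pinning the driver version in the build. A build.sbt sketch (the coordinates are the standard Maven artifacts for the versions named in the question; adjust to your project):

    scalaVersion := "2.11.8"

    libraryDependencies ++= Seq(
      "org.apache.spark"       %% "spark-core"                % "2.0.0" % "provided",
      "com.datastax.spark"     %% "spark-cassandra-connector" % "2.0.0-M2",
      // Pinned to 3.0.0 (down from 3.1.0), per the fix above, so the driver
      // classes on the classpath match what the connector expects.
      "com.datastax.cassandra"  % "cassandra-driver-core"     % "3.0.0"
    )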