5
votes

I'm trying to join an RDD with a Cassandra table using the Spark Cassandra Connector:

samplerdd.joinWithCassandraTable(keyspace, CassandraParams.table)
      .on(SomeColumns(t.date as a.date,
        t.key as a.key))

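For reference, here is a minimal, self-contained sketch of this kind of join. The keyspace, table, column names, host and sample data below are assumptions for illustration, not values from the real job:

import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical shape of the RDD elements; field names are assumed to match the table's key columns.
case class SampleKey(date: String, key: String)

object JoinSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("JoinWithCassandraTableSketch")
      .setMaster("local[*]") // local master for the sketch; the real job runs on spark://sparkmaster:7077
      .set("spark.cassandra.connection.host", "127.0.0.1") // assumed Cassandra contact point

    val sc = new SparkContext(conf)

    val samplerdd = sc.parallelize(Seq(SampleKey("2016-09-01", "k1")))

    // Join the RDD against the Cassandra table on the date and key columns.
    // Use "columnName" as "fieldName" inside SomeColumns when the RDD field
    // names differ from the table column names.
    val joined = samplerdd
      .joinWithCassandraTable("my_keyspace", "my_table")
      .on(SomeColumns("date", "key"))

    joined.collect().foreach(println)
    sc.stop()
  }
}
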
The join works in standalone mode, but when I execute it in cluster mode I get this error:

Job aborted due to stage failure: Task 6 in stage 0.0 failed 4 times, most recent failure: Lost task 6.3 in stage 0.0 (TID 20, 10.10.10.51): java.io.InvalidClassException: com.datastax.spark.connector.rdd.CassandraJoinRDD; local class incompatible: stream classdesc serialVersionUID = 6155891939893411978, local class serialVersionUID = 1245204129865863681

I have already checked the jars on the master and the slaves, and they seem to be the same versions.

I'm using Spark 2.0.0, Cassandra 3.7, Spark Cassandra Connector 2.0.0-M2, Cassandra Driver Core 3.1.0, and Scala 2.11.8.

What could be happening?

1
You have a dependency version mismatch somewhere. Look for it. - maasg
You mention "I have already checked the jars in the master and slaves". You should not place jars on the slaves. Instead use spark-submit --jars <deps...> to submit your job. See spark.apache.org/docs/latest/… - maasg
I'm using .master("spark://sparkmaster:7077") instead of spark-submit. I configure the jars with setJars (see the sketch after these comments). Is that right? - Manuel Valero
That should work too (that's how I do it too). Check if the slaves have a local copy of the spark.cassandra.connector jar. - maasg
It should NOT exist in any slave. Let the driver deal with distributing the right jars. - maasg
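
As maasg suggests, the dependency jars can be passed at submit time with spark-submit --jars (a comma-separated list), so the driver distributes them and nothing needs to be copied onto the slaves. The programmatic equivalent with setJars, which Manuel describes, might look like the sketch below; the jar path is hypothetical:

import org.apache.spark.{SparkConf, SparkContext}

// Let the driver ship the application jar (and its bundled dependencies) to the
// executors, so no connector jar has to live on the slaves. The path is hypothetical;
// individual dependency jars can be listed in the same Seq instead of an assembly.
val conf = new SparkConf()
  .setAppName("CassandraJoinJob")
  .setMaster("spark://sparkmaster:7077")
  .setJars(Seq("target/scala-2.11/my-job-assembly.jar"))

val sc = new SparkContext(conf)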

1 Answer

1
vote

Finally solved it. Updating the cassandra-driver-core dependency to 3.0.0 works. – Manuel Valero
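
For completeness, a sketch of how the aligned dependencies might look in an sbt build; the artifact coordinates are assumptions based on the versions mentioned in the question and in this fix:

// build.sbt (sketch)
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "org.apache.spark"       %% "spark-core"                 % "2.0.0" % "provided",
  "com.datastax.spark"     %% "spark-cassandra-connector"  % "2.0.0-M2",
  // Pin the driver to 3.0.0 as reported working here, instead of 3.1.0.
  "com.datastax.cassandra"  % "cassandra-driver-core"      % "3.0.0"
)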