
Spark version: 1.4.1

Cassandra Version: 2.1.8

DataStax Spark Cassandra Connector: 1.4.2-SNAPSHOT (assembly jar built from source)

Command I ran

./spark-submit --jars /usr/local/src/spark-cassandra-connector/spark-cassandra-connector-java/target/scala-2.10/spark-cassandra-connector-java-assembly-1.4.2-SNAPSHOT.jar --driver-class-path /usr/local/src/spark-cassandra-connector/spark-cassandra-connector-java/target/scala-2.10/spark-cassandra-connector-java-assembly-1.4.2-SNAPSHOT.jar --jars /usr/local/lib/spark-1.4.1/external/kafka/target/scala-2.10/spark-streaming-kafka_2.10-1.4.1.jar --jars /usr/local/lib/spark-1.4.1/external/kafka-assembly/target/scala-2.10/spark-streaming-kafka-assembly_2.10-1.4.1.jar --driver-class-path /usr/local/lib/spark-1.4.1/external/kafka/target/scala-2.10/spark-streaming-kafka_2.10-1.4.1.jar --driver-class-path /usr/local/lib/spark-1.4.1/external/kafka-assembly/target/scala-2.10/spark-streaming-kafka-assembly_2.10-1.4.1.jar --packages org.apache.spark:spark-streaming-kafka_2.10:1.4.1 --executor-memory 6g --executor-cores 6 --master local[4] kafka_streaming.py

Below is the error I am getting:

Py4JJavaError: An error occurred while calling o169.save.
: java.lang.RuntimeException: Failed to load class for data source: org.apache.spark.sql.cassandra

I must be doing something silly. Any help would be appreciated.


1 Answer


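Pass all of your jars in a single --jars option, comma-separated. spark-submit keeps only the last value of a repeated option, so in your command the Cassandra connector jar given in the first --jars is overridden by the later ones and never gets loaded: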

--jars yourFirstJar.jar,yourSecondJar.jar
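
To make the "each option exactly once" rule concrete, here is a minimal Python sketch that assembles the command line from a list of jars. The helper name build_spark_submit is hypothetical (it is not part of Spark), and the two jar paths are simply copied from the question. It also assumes that --driver-class-path is treated as an ordinary classpath, so its entries are joined with the platform path separator rather than commas.

```python
import os
import shlex


def build_spark_submit(script, jars, packages=(), master="local[4]"):
    """Hypothetical helper: build a spark-submit argv with each option given once."""
    argv = ["./spark-submit", "--master", master]
    if jars:
        # --jars expects ONE comma-separated list; a repeated --jars would
        # replace the earlier value instead of adding to it.
        argv += ["--jars", ",".join(jars)]
        # Assumption: --driver-class-path is a normal classpath string,
        # so entries are joined with os.pathsep (":" on Linux/macOS).
        argv += ["--driver-class-path", os.pathsep.join(jars)]
    if packages:
        # --packages takes comma-separated Maven coordinates.
        argv += ["--packages", ",".join(packages)]
    argv.append(script)
    return argv


# Jar paths taken from the question.
jars = [
    "/usr/local/src/spark-cassandra-connector/spark-cassandra-connector-java"
    "/target/scala-2.10/spark-cassandra-connector-java-assembly-1.4.2-SNAPSHOT.jar",
    "/usr/local/lib/spark-1.4.1/external/kafka-assembly/target/scala-2.10"
    "/spark-streaming-kafka-assembly_2.10-1.4.1.jar",
]

argv = build_spark_submit("kafka_streaming.py", jars)

# Print a copy-pasteable command line instead of running it.
print(" ".join(shlex.quote(arg) for arg in argv))
```

The script only prints the command, so you can inspect it before running anything. Passing the resulting list to subprocess.call would launch the job if you prefer to drive it from Python.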

A more convenient option during development is to let spark-submit fetch the artifacts from Maven Central with --packages, again as a single comma-separated list of coordinates:

--packages org.apache.spark:spark-streaming-kafka_2.10:1.4.1,com.datastax.spark:spark-cassandra-connector_2.10:1.4.1