What I'm doing:
- Trying to connect Spark to Cassandra so that I can read data stored in Cassandra tables from Spark.
What steps have I followed:
- Downloaded Cassandra 2.1.12 and Spark 1.4.1.
- Built Spark with
sudo build/mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package
and
sbt/sbt clean assembly
- Stored some data into Cassandra.
- Downloaded these jars into spark/lib:
cassandra-driver-core-2.1.1.jar and spark-cassandra-connector_2.11-1.4.1.jar
- Added the jar file paths to conf/spark-defaults.conf like:
spark.driver.extraClassPath \
~/path/to/spark-cassandra-connector_2.11-1.4.1.jar:\
~/path/to/cassandra-driver-core-2.1.1.jar
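For reference, I believe the same classpath can also be passed at launch time instead of through spark-defaults.conf. A sketch of what that invocation would look like (the /path/to/ segments are placeholders for the actual jar locations on my machine):

```shell
# Equivalent to the spark-defaults.conf entries above, passed on the
# command line. /path/to/... are placeholders, not my real paths.
./bin/spark-shell \
  --driver-class-path /path/to/spark-cassandra-connector_2.11-1.4.1.jar:/path/to/cassandra-driver-core-2.1.1.jar
```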
How am I running the shell:
After starting Cassandra with ./bin/cassandra, I run Spark like this:
sudo ./bin/pyspark
and I also tried sudo ./bin/spark-shell
What query am I making:
sqlContext.read.format("org.apache.spark.sql.cassandra")\
.options(table="users", keyspace="test")\
.load()\
.show()
The problem:
java.lang.NoSuchMethodError:\
scala.Predef$.$conforms()Lscala/Predef$$less$colon$less;
But org.apache.spark.sql.cassandra is present in the spark-cassandra-connector jar that I downloaded.
Here is the full log trace:
What have I tried:
- Tried running with the --packages, --driver-class-path, and --jars options, adding the two jars.
- Tried downgrading Scala to 2.1 and running the Scala shell, but I still get the same error.
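For concreteness, the --packages attempt looked roughly like this. The Maven coordinate below is my best guess at the artifact matching the jar I downloaded, so treat it as an assumption:

```shell
# Launching pyspark with the connector resolved from Maven Central instead
# of local jars. The coordinate is an assumption: connector 1.4.1 built
# for Scala 2.11, published under the com.datastax.spark groupId.
./bin/pyspark \
  --packages com.datastax.spark:spark-cassandra-connector_2.11:1.4.1
```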
Questions I've been thinking about:
- Are the versions of Cassandra, Spark, and Scala that I'm using compatible with each other?
- Am I using the correct version of the jar files?
- Did I compile Spark the wrong way?
- Am I missing something or doing something wrong?
I'm really new to Spark and Cassandra, so I could use some advice. I've been spending hours on this, and it's probably something trivial.