I am using Ubuntu and trying to connect Spark to Cassandra. I followed these steps:

git clone https://github.com/datastax/spark-cassandra-connector.git
cd spark-cassandra-connector
./sbt/sbt assembly
./spark-shell --jars ~/spark/jars/spark-cassandra-connector-assembly-1.4.0-SNAPSHOT.jar

After that, I ran the following in the Spark shell:

scala> sc.stop
scala> import com.datastax.spark.connector._
scala> import org.apache.spark.SparkContext
scala> import org.apache.spark.SparkContext._
scala> import org.apache.spark.SparkConf
scala> val conf = new SparkConf(true).set("spark.cassandra.connection.host", "localhost")
scala> val sc = new SparkContext(conf)
scala> val test_spark_rdd = sc.cassandraTable("keyspace", "table")

I am using Spark 2.2.1, and my Cassandra version is apache-cassandra-2.2.12.

When I enter this command

scala> val test_spark_rdd = sc.cassandraTable("keyspace", "table")

it gives me this error.

error: missing or invalid dependency detected while loading class file 'CassandraConnector.class'.
Could not access type Logging in package org.apache.spark,
because it (or its dependencies) are missing. Check your build definition for
missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
A full rebuild may help if 'CassandraConnector.class' was compiled against an incompatible version of org.apache.spark.

I have followed several tutorials but have not been able to solve this issue. Can someone give me a suggestion? Thanks.

Comments:

Is the connector version 1.4? For Spark 2.x it should be spark-cassandra-connector 2.0. - undefined_variable
Would you tell me how to install 2.x? - Real tiger
Usually sbt or Maven is used for dependency management. - undefined_variable

1 Answer

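Don't download or build the jar files yourself. Instead, point the Spark shell at the Maven dependency:

./bin/spark-shell --packages "com.datastax.spark:spark-cassandra-connector_2.11:2.0.7"

The Spark shell will then download the right jar files from Maven Central automatically. The _2.11 suffix is the Scala version, which has to match the Scala version of your Spark build (2.11 for the default Spark 2.2.1 download).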
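Once the shell has started with the connector on its classpath, a session along the following lines should work. This is only a sketch: it assumes Cassandra is listening on localhost, that the host was passed at launch with --conf spark.cassandra.connection.host=localhost (so there is no need to stop and recreate sc), and it reuses the placeholder names "keyspace" and "table" from the question.

```scala
// Assumed launch command (not shown in the question), run from the Spark directory:
//   ./bin/spark-shell --conf spark.cassandra.connection.host=localhost \
//     --packages "com.datastax.spark:spark-cassandra-connector_2.11:2.0.7"

// Brings the cassandraTable method into scope on the existing SparkContext `sc`
import com.datastax.spark.connector._

// "keyspace" and "table" are placeholders; substitute your own names
val test_spark_rdd = sc.cassandraTable("keyspace", "table")

// Force a read from Cassandra to confirm the connection works
println(test_spark_rdd.count())
```

If the count prints without the "Could not access type Logging" error, the connector version matches your Spark version.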