I am using spark-cassandra-connector to connect to Cassandra from Spark.
I am able to connect through Livy successfully with the command below.
curl -X POST --data '{"file": "/my/path/test.py", "conf" : {"spark.jars.packages": "com.datastax.spark:spark-cassandra-connector_2.11:2.3.0", "spark.cassandra.connection.host":"myip"}}' -H "Content-Type: application/json" localhost:8998/batches
I am also able to connect interactively through the pyspark shell with the command below.
sudo pyspark --packages com.datastax.spark:spark-cassandra-connector_2.10:2.0.10 --conf spark.cassandra.connection.host=myip
However, I am not able to connect through spark-submit. Some of the commands I have tried are below.
spark-submit test.py --packages com.datastax.spark:spark-cassandra-connector_2.11:2.3.2 --conf spark.cassandra.connection.host=myip
This one didn't work.
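One thing that may matter for the command above: spark-submit expects its options before the application file, and anything placed after the file is passed to the application as its own arguments. Here is a minimal sketch, assuming that ordering is the relevant difference, that assembles the command with the options first. The helper build_submit_command is hypothetical and not part of Spark; the package coordinates and host are the ones from the attempt above.

```python
import shlex


def build_submit_command(script, packages, conf):
    """Hypothetical helper: place spark-submit options before the script."""
    cmd = ["spark-submit", "--packages", ",".join(packages)]
    for key, value in conf.items():
        cmd += ["--conf", f"{key}={value}"]
    # The application file goes last; arguments after it go to the script itself.
    cmd.append(script)
    return cmd


cmd = build_submit_command(
    "test.py",
    ["com.datastax.spark:spark-cassandra-connector_2.11:2.3.2"],
    {"spark.cassandra.connection.host": "myip"},
)
# Print a shell-ready command line that can be copied into a terminal.
print(" ".join(shlex.quote(part) for part in cmd))
```

The printed line can be run as is, or the list can be handed to subprocess.run(cmd) to launch the job from Python.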
I also tried setting these parameters inside the Python file passed to spark-submit, but that still didn't work.
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext
conf = (SparkConf().setAppName("Spark-Cassandracube").set("spark.cassandra.connection.host", "myip").set("spark.jars.packages", "com.datastax.spark:spark-cassandra-connector_2.11:2.3.0"))
sc = SparkContext(conf=conf)
sqlContext = SQLContext(sc)
I also tried passing these parameters from a Jupyter notebook.
import os
os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages com.datastax.spark:spark-cassandra-connector_2.11:2.3.0 --conf spark.cassandra.connection.host="myip" pyspark-shell'
All the threads that I have seen so far discuss using spark-cassandra-connector with spark-shell, but say little about spark-submit.
Versions used
Livy: 0.5.0, Spark: 2.4.0, Cassandra: 3.11.4