I am very new to Cassandra and Spark. This is what I have done so far:
1) Installed Cassandra 2.1.8, added Lucene secondary indexes, and loaded test data.
2) I have a pre-built Spark 1.4.1.
3) I have the Spark Cassandra connector jars.
I am able to run ./spark-shell --jars /path/to/spark-cassandra-connector/spark-cassandra-connector-assembly-1.5.0-M1-SNAPSHOT.jar and
./pyspark --jars /path/to/pyspark_cassandra-0.1.5.jar --driver-class-path /path/to/pyspark_cassandra-0.1.5.jar --py-files /path/to/pyspark_cassandra-0.1.5-py2.6.egg
With both, I am able to query the Cassandra table.
My requirement is as follows:
We have a PHP application on a remote server. This application will request data, with some filters applied, from the Spark/Cassandra layer.
- What is the best way to serve this request?
- Which is the preferred language, Python or Scala?
- For a REST API, which Scala framework is recommended?
Currently I am just trying out a simple Python script served from cgi-bin. The problem is: how do I add the connector jars (the equivalent of --jars) from within the Python script?
I have tried conf.set("spark.jars", "/jar/path"), which does not work.
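Would something along these lines be the right direction? This is only a minimal sketch that I have not tested here. It assumes that PySpark reads the PYSPARK_SUBMIT_ARGS environment variable when the script launches the JVM, and that on Spark 1.4 the value has to end with pyspark-shell. The paths are the same placeholders as in the commands above, and build_submit_args, make_context and the app name are hypothetical names used only for illustration.

```python
import os

# Placeholder paths, copied from the ./pyspark command above; not real locations.
JAR = "/path/to/pyspark_cassandra-0.1.5.jar"
EGG = "/path/to/pyspark_cassandra-0.1.5-py2.6.egg"


def build_submit_args(jar, egg):
    """Hypothetical helper: assemble the same flags that were passed to ./pyspark."""
    flags = [
        "--jars", jar,
        "--driver-class-path", jar,
        "--py-files", egg,
        # Assumption: Spark 1.4+ expects the value to end with "pyspark-shell".
        "pyspark-shell",
    ]
    return " ".join(flags)


# Set this before any SparkContext is created, because the JVM is launched
# at that point and the flags are only read then.
os.environ["PYSPARK_SUBMIT_ARGS"] = build_submit_args(JAR, EGG)


def make_context(app_name="cgi-query"):
    """Create a SparkContext. PySpark is imported lazily so that the module
    itself loads even when PySpark is not on the Python path."""
    from pyspark import SparkConf, SparkContext
    conf = SparkConf().setAppName(app_name)
    return SparkContext(conf=conf)
```

The CGI script would then call make_context() once and run its queries against the returned context. I am not sure whether this is preferable to launching the script through spark-submit with the same flags.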
Any help would be highly appreciated.
Thanks in advance.