0
votes

I am processing my Cassandra data in Spark. I am using "spark-cassandra-connector", which loads data from Cassandra into an RDD.

I want to use Spark 2.x Datasets instead, as Datasets should improve my performance. Any idea how I can do that?

Any code snippet would be a great help.


1 Answer

3
votes

Use the connector's Spark SQL data source:

spark.read.format("org.apache.spark.sql.cassandra")
   .options(Map("keyspace" -> "your_keyspace", "table" -> "your_table"))
   .load()
   .filter(conditions)

You don't have to convert from an RDD to a Dataset: load() returns a DataFrame, which in Spark 2.x is a Dataset[Row].
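
If you want a typed Dataset rather than a Dataset[Row], something along these lines may work. This is only a sketch: the User case class, its id and name columns, the filter value, and the 127.0.0.1 contact point are all made up for illustration, and it assumes the spark-cassandra-connector jar is on the classpath and the job is launched with spark-submit.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical row type: field names must match the column names in your table.
case class User(id: Int, name: String)

object CassandraDatasetExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cassandra-dataset-example")
      // Assumed contact point; replace with your cluster's address.
      .config("spark.cassandra.connection.host", "127.0.0.1")
      .getOrCreate()

    // Brings the Encoder for the case class and the $"col" syntax into scope.
    import spark.implicits._

    val users: Dataset[User] = spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "your_keyspace", "table" -> "your_table"))
      .load()
      .select("id", "name") // keep only the columns the case class declares
      .filter($"id" > 100)  // column-expression filter on the untyped DataFrame
      .as[User]             // DataFrame -> Dataset[User]

    users.show()
    spark.stop()
  }
}
```

The filter is written as a column expression rather than a Scala lambda because Spark can inspect column expressions and hand them to the data source, while a lambda such as `_.id > 100` is opaque to the optimizer.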