I am applying the following through the Spark Cassandra Connector:
val links = sc.textFile("linksIDs.txt")
links.map( link_id =>
{
val link_speed_records = sc.cassandraTable[Double]("freeway","records").select("speed").where("link_id=?",link_id)
average = link_speed_records.mean().toDouble
})
I would like to ask if there is way to apply the above sequence of queries more efficiently given that the only parameter I always change is the 'link_id'.
The 'link_id' value is the only Partition Key of my Cassandra 'records' table. I am using Cassandra v.2.0.13, Spark v.1.2.1 and Spark-Cassandra Connector v.1.2.1
I was thinking if it is possible to open a Cassandra Session in order to apply those queries and still get the 'link_speed_records' as a SparkRDD.