0 votes

I'm very new to Spark and Scala, and I'm using spark-shell to access Cassandra through the DataStax open source connector,

with this command:

sc.cassandraTable("keyspace","table")
  .select("gender","name")
  .mapValues(v => v.get())
  .lookup("Male")

and I get this error:

error: value mapValues is not a member of com.datastax.spark.connector.rdd.CassandraTableScanRDD[com.datastax.spark.connector.CassandraRow]

I don't know whether this transformation is only available in DataStax Enterprise, and I haven't been able to find more information about it.

More details:

  • Java 1.8.0_151
  • Spark 2.2.1
  • Scala 2.11
  • Cassandra 3.11.1
Comments:

  • mapValues is applicable to RDD[Tuple2[_, _]], not any RDD. – Alper t. Turker
  • but there is an RDD... com.datastax.spark.connector.rdd.CassandraTableScanRDD – Juan Antonio Aguilar
  • It is not whether it is an RDD, but an RDD of what. For example: RDD[Int] has no mapValues; RDD[(String, Int)] has mapValues, which takes Int => U. – Alper t. Turker
  • @JuanAntonioAguilar: What is the expected result from that command? – mrsrinivas
  • Thank you @user8371915, that's it... I wasn't casting the Cassandra types to Scala types. – Juan Antonio Aguilar
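
To make the comments concrete: mapValues is added by Spark's PairRDDFunctions through an implicit conversion that exists only for RDDs of two-element tuples. A minimal standalone sketch (not Cassandra-specific; sample data is made up) you can paste into spark-shell:

val plain = sc.parallelize(Seq(1, 2, 3))                               // RDD[Int]
// plain.mapValues(_ + 1)                                              // does not compile: no mapValues on RDD[Int]

val pairs = sc.parallelize(Seq(("Male", "John"), ("Female", "Jane")))  // RDD[(String, String)]
pairs.mapValues(_.toUpperCase)                                         // compiles: PairRDDFunctions is in scope
pairs.lookup("Male")                                                   // Seq("John")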

1 Answer

0 votes

OK, I've solved it this way, using the hints from the comments on the question:

sc.cassandraTable[(String, String)]("keyspace", "table")
  .where("gender = 'Male'")              // filter on the Cassandra side
  .select("gender", "name")              // read only the two columns we need
  .map { case (_, name) => (name, 1) }   // key by name, value 1
  .reduceByKey(_ + _)                    // sum the 1s: occurrences per name
  .collect.foreach(println)

The key to the solution is the type parameter in cassandraTable[(String, String)]: it tells the connector to convert each Cassandra row directly into a Scala tuple instead of a generic CassandraRow.
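
For reference, a sketch of the other route the comments point at: keep the untyped CassandraRow RDD and build the pair RDD by hand with getString (a standard CassandraRow accessor), which makes mapValues and lookup from the original attempt available:

val pairs = sc.cassandraTable("keyspace", "table")
  .select("gender", "name")
  .map(row => (row.getString("gender"), row.getString("name")))  // RDD[(String, String)]

pairs.lookup("Male")   // all names whose gender is "Male"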

Thank you.