I'm new to Spark and Scala and I've run into a compile error in Scala. Let's say we have an RDD whose elements are maps, built like this:
val rawData = someRDD.map{
  //some ops
  Map(
    "A" -> someInt_var1, //Int
    "B" -> someInt_var2, //Int
    "C" -> somelong_var  //Long
  )
}
Then I want to get histogram info for these values. Here is my code:
rawData.map{row => row.get("A")}.histogram(10)
And the compile error says:
value histogram is not a member of org.apache.spark.rdd.RDD[Option[Any]]
Why does rawData.map{row => row.get("A")}
have type org.apache.spark.rdd.RDD[Option[Any]],
and how can I transform it into an RDD[Int]?
I tried this:
rawData.map{row => row.get("A")}.map{_.toInt}.histogram(10)
But it fails to compile:
value toInt is not a member of Option[Any]
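For reference, the same types seem to show up without Spark at all. Below is a minimal sketch on a plain Scala Map. The object name OptionAnyRepro and the literal values are made up for illustration, and the value type Any is my assumption, based only on the Option[Any] in the error message:

```scala
// Minimal sketch without Spark; names and values here are hypothetical.
object OptionAnyRepro {
  // Stand-in for one element of rawData. The value type Any is an
  // assumption taken from the Option[Any] in the compile error.
  val row: Map[String, Any] = Map("A" -> 1, "B" -> 2, "C" -> 3L)

  // Map.get returns an Option of the map's value type, so this is Option[Any].
  val a: Option[Any] = row.get("A")

  // Uncommenting the next line gives the same kind of error as above:
  // value toInt is not a member of Option[Any]
  // val n = a.toInt
}
```

So the Option comes from Map.get, and the Any comes from the value type of the map itself.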
I'm totally confused and would appreciate any help.