I'm learning Spark/Scala after writing some MapReduce jobs.
I wrote some Java beans to parse a file in HDFS, and I'd like to reuse them to speed up my progress in Spark.
I've had success loading the file and creating an RDD of my Java bean objects:
val input = sc.textFile("hdfs://host:port/user/test/path/out")
import my.package.Record
val clust_recs = input.map(line => new Record(line))
clust_recs.map(rec => rec.getPremium()).stats()
But the last line creates this error:
<console>:46: error: could not find implicit value for parameter num: Numeric[Double]
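For comparison, stats() works when the RDD is built directly from Scala Doubles, so the problem seems tied to whatever type getPremium() returns. Here's a sanity check that can be run in the same shell:
val sanity = sc.parallelize(Seq(1250.6, 433.72, 567.07))
sanity.stats()  // fine: RDD[Double] picks up Spark's DoubleRDDFunctions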
I've tested that the values in this field are all valid, so I am pretty sure I don't have any null values that could be causing this error.
Here is an example of values:
val dblArray = clust_recs.map(rec => rec.getPremium()).filter(d => !d.isNaN)
dblArray.take(10)
OUTPUT:
res82: Array[Double] = Array(1250.6, 433.72, 567.07, 219.24, 310.32, 2173.48, 195.0, 697.94, 711.46, 42.718050000000005)
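One workaround that should sidestep the implicit lookup, assuming getPremium() returns a boxed java.lang.Double rather than a Scala Double, is to unbox explicitly before calling stats():
// unbox to scala.Double so the RDD is RDD[Double], which provides stats()
val premiums = clust_recs.map(rec => rec.getPremium().doubleValue())
premiums.stats()
But even if that works, I'd still like to understand why the original version fails.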
I'm at a loss as to how to resolve this error and am wondering whether I should just abandon the idea of reusing the JavaBean class I've already written.
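If I did switch away from the bean, I imagine it would look something like this case-class sketch (the field name, delimiter, and column index below are illustrative, not my real layout):
case class RecordCC(premium: Double)
val recs = input.map { line =>
  val cols = line.split('|')   // assuming a pipe-delimited line
  RecordCC(cols(3).toDouble)   // premium column index is illustrative
}
recs.map(_.premium).stats()    // RDD[Double], so stats() resolves
But I'd prefer to keep the parsing logic I've already written.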
Comments:
Why not make my.package.Record a case class? – Ramesh Maharjan
What does Record.getPremium() return? – Jacek Laskowski