I am trying to read a CSV file into an RDD in Spark (using Scala). I wrote a function to filter the data so that the header line is not taken into consideration:
def isHeader(line: String): Boolean = {
  line.contains("id_1")
}
Then I run the following command:
val noheader = rawblocks.filter(x => !isHeader(x))
The rawblocks RDD reads its data from a CSV file that is 26 MB in size.
I am getting a "Task not serializable" error. What could be the solution?
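
One possible cause, offered as an assumption because the post does not show where isHeader is defined: if isHeader is a method of an enclosing class (or of a shell/notebook wrapper) that also holds something non-serializable such as the SparkContext, the closure passed to filter captures that whole enclosing instance, and Spark fails when it tries to serialize it. The sketch below moves the function into a standalone object so the closure refers only to that object. The names HeaderFilter and dropHeader are hypothetical, and the Spark call appears only in a comment so the block runs without a Spark dependency.

```scala
// Hypothetical holder object; the name HeaderFilter is not from the original post.
// Methods on a top-level object are not tied to an enclosing class instance,
// so a closure that calls them does not drag a non-serializable `this` along.
object HeaderFilter {
  // Same check as in the post: a line counts as the header if it contains "id_1".
  def isHeader(line: String): Boolean = line.contains("id_1")

  // Plain-collection stand-in for the RDD filter, so the logic can be tried locally.
  def dropHeader(lines: Seq[String]): Seq[String] =
    lines.filterNot(isHeader)
}

// Assumed usage with the RDD from the post (not executed here):
//   val noheader = rawblocks.filter(line => !HeaderFilter.isHeader(line))
```

If the error persists after this change, the full stack trace usually names the class that could not be serialized, which points to whatever else the closure is capturing.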