I have an RDD in Spark whose elements are essentially (timestamp, id), where the timestamp is a Joda-Time DateTime of the form yyyy/MM/dd HH:mm. The elements of the RDD are instances of this case class:
case class myRDD(timestamp: org.joda.time.DateTime, id: String)
I am using Spark and Scala.
I want to filter the data down to a single day, e.g. 2000/01/01, and return something of the form (timestamp, id), but I am unsure how to use filter() with the Joda timestamp. I have created the start and end of the interval I want to filter on as follows:
val start = myFormat.parseDateTime("2000/01/01 00:00")
val end = myFormat.parseDateTime("2000/01/02 00:00")
but I do not know how to apply this to an RDD, or whether this is even the best approach. Any tips would be greatly appreciated.
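For illustration, here is a minimal sketch of one way the interval check could be written as a predicate. It rests on several assumptions: `Event` is a hypothetical stand-in for the case class above, `myFormat` is assumed to be a `DateTimeFormat` built from the yyyy/MM/dd HH:mm pattern, and a local `Seq` stands in for the RDD so the snippet runs without a cluster (for example when pasted into spark-shell or run as a Scala script with Joda-Time on the classpath). On a real `RDD[Event]` the same predicate would be passed to `filter`.

```scala
import org.joda.time.DateTime
import org.joda.time.format.DateTimeFormat

// Hypothetical name standing in for the case class from the question.
case class Event(timestamp: DateTime, id: String)

// Assumed definition of myFormat, matching the yyyy/MM/dd HH:mm pattern.
val myFormat = DateTimeFormat.forPattern("yyyy/MM/dd HH:mm")

val start = myFormat.parseDateTime("2000/01/01 00:00")
val end   = myFormat.parseDateTime("2000/01/02 00:00")

// Half-open interval [start, end): midnight on the 1st is kept,
// midnight on the 2nd is dropped.
def inDay(e: Event): Boolean =
  !e.timestamp.isBefore(start) && e.timestamp.isBefore(end)

// Made-up sample data; with Spark this would be an RDD[Event]
// and the call would read rdd.filter(inDay).
val events = Seq(
  Event(myFormat.parseDateTime("1999/12/31 23:59"), "x"),
  Event(myFormat.parseDateTime("2000/01/01 00:00"), "a"),
  Event(myFormat.parseDateTime("2000/01/01 23:59"), "b"),
  Event(myFormat.parseDateTime("2000/01/02 00:00"), "y")
)

val filtered = events.filter(inDay)
```

The result still has the (timestamp, id) shape, since `filter` only drops elements. If only whole days matter, comparing `e.timestamp.toLocalDate` against a single `LocalDate` might be a simpler alternative to the start/end pair.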
case class rdd(timestamp: org.joda.time.DateTime, id: String) - user7810705