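In Scala, (0 to 20).map(_ => (_, 0)) would not compile, as it has invalid placeholder syntax: the first _ discards the element, and the second _ is read as the start of a new anonymous function rather than as a reference to that element. I believe you might be looking for something like below instead: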
val data = (0 to 20).map( _->0 )
which generates a sequence of key-value pairs (tuples), and is really just placeholder shorthand for:
val data = (0 to 20).map( n => n->0 )
// data: scala.collection.immutable.IndexedSeq[(Int, Int)] = Vector(
// (0,0), (1,0), (2,0), (3,0), (4,0), (5,0), (6,0), (7,0), (8,0), (9,0), (10,0),
// (11,0), (12,0), (13,0), (14,0), (15,0), (16,0), (17,0), (18,0), (19,0), (20,0)
// )
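As a rough illustration of the placeholder forms, the sketch below builds the same pairs three ways and checks nothing beyond plain Scala collections. The object name PlaceholderPairs and the val names are made up for this example; toMap is shown only as a possible alternative to Map(data: _*).

```scala
object PlaceholderPairs {
  // Explicit lambda: the parameter n is named and reused in the body.
  val viaLambda = (0 to 20).map(n => (n, 0))

  // Placeholder with the arrow operator, as used above.
  val viaArrow = (0 to 20).map(_ -> 0)

  // Placeholder inside a tuple literal; note the extra pair of parentheses.
  val viaTuple = (0 to 20).map((_, 0))

  // toMap is an alternative to Map(data: _*) for a sequence of pairs.
  val asMap: Map[Int, Int] = viaArrow.toMap
}
```

All three vals should hold the same 21 pairs, since each placeholder stands for exactly one use of the lambda's parameter.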
An RDD is an immutable, distributed collection of data, which sc.parallelize builds from a local collection (e.g. Seq, Array). To create an RDD of Map[Int,Int], you would expand data inside a Map, which in turn gets placed inside a Seq collection:
val rdd = sc.parallelize(Seq(Map(data: _*)))
rdd.collect
// res1: Array[scala.collection.immutable.Map[Int,Int]] = Array(
// Map(0 -> 0, 5 -> 0, 10 -> 0, 14 -> 0, 20 -> 0, 1 -> 0, 6 -> 0, 9 -> 0, 13 -> 0, 2 -> 0, 17 -> 0,
// 12 -> 0, 7 -> 0, 3 -> 0, 18 -> 0, 16 -> 0, 11 -> 0, 8 -> 0, 19 -> 0, 4 -> 0, 15 -> 0)
// )
Note that, as is, this RDD consists of only a single Map; you can certainly assemble as many Maps as you wish in an RDD:
val rdd2 = sc.parallelize(Seq(
  Map((0 to 4).map( _->0 ): _*),
  Map((5 to 9).map( _->0 ): _*),
  Map((10 to 14).map( _->0 ): _*),
  Map((15 to 19).map( _->0 ): _*)
))
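If writing each range by hand gets tedious, one possible approach is to chunk the keys with grouped and build one Map per chunk. This is only a sketch in plain Scala: the object ChunkedMaps and the helper zeroMaps are hypothetical names, and the result is a local Vector that would still need to be passed to sc.parallelize.

```scala
object ChunkedMaps {
  // Hypothetical helper: split keys into chunks of chunkSize and build
  // one Map per chunk, with every key mapped to 0.
  def zeroMaps(keys: Range, chunkSize: Int): Vector[Map[Int, Int]] =
    keys.grouped(chunkSize).map(chunk => chunk.map(_ -> 0).toMap).toVector

  // Same four Maps as in rdd2: keys 0-4, 5-9, 10-14, 15-19.
  val maps: Vector[Map[Int, Int]] = zeroMaps(0 to 19, 5)

  // In spark-shell, where sc is predefined, this could then be
  // distributed with: sc.parallelize(maps)
}
```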