Just trying to clarify something basic. This question came from watching a user in another question try to call RDD operations on a broadcast variable. That's wrong, right?
The question is: a Spark broadcast variable is not an RDD, correct? It's a collection in Scala, am I seeing that correctly?
Looking at the Scala docs: http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.broadcast.Broadcast
So it takes the type parameter of whatever is passed to it when it's created? Like a Java ArrayList holding Integers, which would be an ArrayList<Integer>?
So sc.broadcast([0,1,2]) would create a Broadcast[Array[Int]] in Scala notation?
scala> val broadcastVar = sc.broadcast(Array(1, 2, 3))
broadcastVar: org.apache.spark.broadcast.Broadcast[Array[Int]] = Broadcast(0)
scala> broadcastVar.value
res0: Array[Int] = Array(1, 2, 3)
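To make the question concrete, here is a minimal sketch of how I currently understand it. It assumes a local SparkContext; the object name BroadcastTypeSketch and the helper keepBroadcastMembers are made up for illustration. The idea is that the Broadcast is only a wrapper typed by the value passed in, so RDD operations are called on an actual RDD and the broadcast contents are read through .value inside the closure.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.broadcast.Broadcast

// Hypothetical names, for illustration only.
object BroadcastTypeSketch {

  // Keeps the elements of a small RDD that also appear in a broadcast array.
  def keepBroadcastMembers(sc: SparkContext): Array[Int] = {
    // The type parameter follows the value passed in: Array[Int] here.
    val lookup: Broadcast[Array[Int]] = sc.broadcast(Array(1, 2, 3))

    // Broadcast has no map/filter of its own; those live on the RDD.
    val rdd = sc.parallelize(Seq(1, 2, 3, 4, 5))

    // Inside the closure, .value gives back the plain Array[Int].
    rdd.filter(x => lookup.value.contains(x)).collect()
  }

  def main(args: Array[String]): Unit = {
    // Assumption: running locally rather than on a cluster.
    val conf = new SparkConf().setMaster("local[*]").setAppName("broadcast-type-sketch")
    val sc = new SparkContext(conf)
    try println(keepBroadcastMembers(sc).mkString(", "))
    finally sc.stop()
  }
}
```

If that reading is right, something like broadcastVar.map(...) would not compile, while broadcastVar.value.map(...) would just be an ordinary Array operation.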
(I really did search around quite a bit for a clear, straightforward answer, but it must be too basic a question, even though it's so important to understand. Thanks.)
It would be nice, but not necessary, to have some info on what Python does with broadcasts. I assume it calls the underlying Scala class and the value is stored as a Scala Broadcast type under the hood?