
In my application I need to create a single-row DataFrame from a Map.

For example, a Map like

Map("col1" -> 5, "col2" -> 10, "col3" -> 6)

would be transformed into a DataFrame with a single row, with the map keys becoming the column names:

col1 | col2 | col3
5    | 10   | 6

In case you are wondering why I would want this: I just need to save a single document with some statistics into MongoDB using the MongoSpark connector, which allows saving DataFrames and RDDs.

What happens when you try to parallelize it in Spark? - OneCricketeer
Are the keys ordered, or do you want to sort them alphabetically? - Andrey Tyukin
@AndreyTyukin No, order doesn't matter. - Daniil Andreyevich Baunov
@cricket_007, I think parallelize doesn't work for Maps. - Daniil Andreyevich Baunov

3 Answers


I figured that sorting the column names wouldn't hurt anyway.

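  import org.apache.spark.sql.Row
  import org.apache.spark.sql.types._

  val map = Map("col1" -> 5, "col2" -> 6, "col3" -> 10)
  // Sort by key so the column order is deterministic, then split keys from values
  val (keys, values) = map.toList.sortBy(_._1).unzip
  // One Row holding all the values, wrapped in a single-element RDD
  val rows = spark.sparkContext.parallelize(Seq(Row(values: _*)))
  // One non-nullable integer column per key
  val schema = StructType(keys.map(
    k => StructField(k, IntegerType, nullable = false)))
  val df = spark.createDataFrame(rows, schema)
  df.show()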


+----+----+----+
|col1|col2|col3|
+----+----+----+
|   5|   6|  10|
+----+----+----+

The idea is straightforward: convert the map to a list of tuples, unzip it, turn the keys into a schema and the values into a single-entry row RDD, then build the DataFrame from the two pieces. (The interface of createDataFrame is a bit strange here: it accepts java.util.Lists and kitchen sinks, but doesn't accept the usual Scala List for some reason.)
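As a rough sketch of the java.util.List route mentioned above, the RDD step can probably be skipped by passing a single-element Java list straight to createDataFrame. This assumes a local SparkSession built by hand (in spark-shell the `spark` value already exists, so the builder line can be dropped); the app name is just a placeholder.

```scala
import java.util.Collections

import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

// Assumption: a local session for illustration; reuse your own `spark` if you have one
val spark = SparkSession.builder().master("local[*]").appName("map-to-df").getOrCreate()

val map = Map("col1" -> 5, "col2" -> 6, "col3" -> 10)

// Sort by key for a deterministic column order, then split keys from values
val (keys, values) = map.toList.sortBy(_._1).unzip

// One non-nullable integer column per key
val schema = StructType(keys.map(k => StructField(k, IntegerType, nullable = false)))

// createDataFrame has an overload taking a java.util.List[Row] plus a schema,
// so a single-element Java list replaces the parallelize call
val df = spark.createDataFrame(Collections.singletonList(Row(values: _*)), schema)
```

The result should be the same single-row DataFrame as with the RDD version; the only difference is that the one row is handed over as a local list.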


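Here you go:

import org.apache.spark.sql.functions.lit
import spark.implicits._

val map: Map[String, Int] = Map("col1" -> 5, "col2" -> 6, "col3" -> 10)

// Seed a one-row DataFrame from the first entry, then add the rest as literal columns
val df = map.tail
  .foldLeft(Seq(map.head._2).toDF(map.head._1))((acc, curr) => acc.withColumn(curr._1, lit(curr._2)))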


+----+----+----+
|col1|col2|col3|
+----+----+----+
|   5|   6|  10|
+----+----+----+

A slight variation on Rapheal's answer: you can create a dummy DataFrame with a single row and a single column (1x1), add the map entries using foldLeft, and finally drop the dummy column. That way, the foldLeft is straightforward and easy to remember.

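import org.apache.spark.sql.functions.lit
import spark.implicits._

val map: Map[String, Int] = Map("col1" -> 5, "col2" -> 6, "col3" -> 10)

// 1x1 seed DataFrame; its dummy column is dropped after the fold
val f = Seq("1").toDF("dummy")

map.keys.toList.sorted.foldLeft(f) { (acc, x) => acc.withColumn(x, lit(map(x))) }.drop("dummy").show(false)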

+----+----+----+
|col1|col2|col3|
+----+----+----+
|5   |6   |10  |
+----+----+----+
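If the dummy column feels awkward, one possible alternative (not from the answers above, just a sketch) is to start from a one-row Dataset and select all map entries as literal columns in a single call. It assumes a hand-built local SparkSession with a placeholder app name, and uses spark.range(1) only to get exactly one row.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.lit

// Assumption: a local session for illustration; reuse your own `spark` if you have one
val spark = SparkSession.builder().master("local[*]").appName("map-to-df-select").getOrCreate()

val map: Map[String, Int] = Map("col1" -> 5, "col2" -> 6, "col3" -> 10)

// Turn each entry into a literal column named after its key, sorted by key
val cols = map.toSeq.sortBy(_._1).map { case (name, value) => lit(value).as(name) }

// spark.range(1) yields one row; select keeps only the literal columns,
// so the generated `id` column never appears in the result
val df = spark.range(1).select(cols: _*)
```

Because everything happens in one select, there is no chain of withColumn calls and nothing to drop afterwards.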