
I am trying to create an empty DataFrame in Spark (Scala), define my own schema for it, and load a record into it.

Below is an example:

val emptyDf = spark.emptyDataFrame

val loadEmptyDf = emptyDf.withColumn("col1", lit("yes")).withColumn("col2", lit("no"))

But I am not getting the default values that I set while adding the columns.

Current output:

+----+----+
|col1|col2|
+----+----+
+----+----+

Expected output:

+----+----+
|col1|col2|
+----+----+
| yes|  no|
+----+----+


1 Answer


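withColumn adds a column, and lit assigns a constant value to every existing row, but your DataFrame is empty, so there are no rows to fill. Create the DataFrame from the data with an explicit schema instead: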

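import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StringType, StructType}

// One row holding the desired values
val data = Seq(Row("yes", "no"))

// Schema with two string columns
val schema: StructType = new StructType()
  .add("col1", StringType)
  .add("col2", StringType)

// Build the DataFrame from the rows and the schema
val df = spark.createDataFrame(spark.sparkContext.parallelize(data), schema)

df.show()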

+----+----+
|col1|col2|
+----+----+
| yes|  no|
+----+----+
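
If it helps to see both cases side by side, here is a minimal sketch. It assumes a local SparkSession; the object name EmptyVsSeeded and the app name are made up for illustration. It also uses toDF from spark.implicits as a shorter alternative to the explicit StructType above, which is a swapped-in approach and not the code from the answer.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.lit

// Hypothetical object name, used only for this illustration
object EmptyVsSeeded {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session is acceptable for trying this out
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("empty-vs-seeded")
      .getOrCreate()
    import spark.implicits._

    // Adding literal columns to an empty DataFrame creates the columns,
    // but there are still no rows for lit to fill
    val empty = spark.emptyDataFrame
      .withColumn("col1", lit("yes"))
      .withColumn("col2", lit("no"))
    println(empty.columns.mkString(","))
    println(empty.count())

    // Starting from one row of data gives a DataFrame that actually holds the values
    val seeded = Seq(("yes", "no")).toDF("col1", "col2")
    seeded.show()

    spark.stop()
  }
}
```

With toDF both columns are inferred as strings from the tuple, so the explicit schema is only needed when you want to control types or nullability yourself.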