I am using Spark/Scala and I want to fill the nulls in my DataFrame with default values based on the type of the columns.
i.e., String columns -> "string", numeric columns -> 111, Boolean columns -> false, etc.
Currently the DataFrameNaFunctions API (df.na) provides fill(valueMap: Map[String, Any]), used like this:
    df.na.fill(Map(
      "A" -> "unknown",
      "B" -> 1.0
    ))
This requires knowing both the names and the types of the columns in advance.
OR
fill(value: String, cols: Seq[String])
This only covers String and Double values; there is no Boolean variant.
Is there a smart way to do this?
isInstanceOf to check the incoming data type and replace with proper value. - Shankar
Spark v2.2.1 supports only a limited number of datatypes for the DataFrame.na.fill operation. Quoting the docs, "value must be of the following type: Int, Long, Float, Double, String, Boolean." - y2k-shubham
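One way to avoid hard-coding column names is to derive the fill map from the DataFrame's schema. The following is only a sketch, assuming Spark SQL is on the classpath. The names TypedNullFill, defaultFor, defaultsFor and fillByType are hypothetical helpers made up for this example, not part of Spark, and the default values are the ones from the question. Columns whose type has no default listed here (dates, timestamps, arrays, and so on) are left untouched.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.types._

// Hypothetical helper object; none of these names come from Spark itself.
object TypedNullFill {

  // Pick a default for a Spark SQL data type, or None to leave the column alone.
  def defaultFor(dataType: DataType): Option[Any] = dataType match {
    case StringType                                  => Some("string")
    case ByteType | ShortType | IntegerType | LongType => Some(111L)
    case FloatType | DoubleType                      => Some(111.0)
    case BooleanType                                 => Some(false)
    case _                                           => None
  }

  // Build a column-name -> default-value map from the schema.
  def defaultsFor(schema: StructType): Map[String, Any] =
    schema.fields.toSeq
      .flatMap(field => defaultFor(field.dataType).map(value => field.name -> value))
      .toMap

  // Fill nulls using the existing na.fill(Map[String, Any]) overload.
  def fillByType(df: DataFrame): DataFrame =
    df.na.fill(defaultsFor(df.schema))
}
```

With this in scope, TypedNullFill.fillByType(df) should replace nulls column by column without any column names being listed by hand. Every value placed in the map is a Long, Double, String or Boolean, which stays within the set of value types quoted from the docs in the comment above.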