I have JSON files with the following schema:
{
  "name": "john doe",
  "phone-numbers": {
    "home": ["1111", "222"],
    "country": "England"
  }
}
The `home` phone-numbers array can sometimes be empty.

My Spark application receives a list of these JSON files and does the following:
val dataframe = spark.read.json(filePaths: _*)
val result = dataframe.select($"name",
  explode(dataframe.col("phone-numbers.home")))
When the 'home' array is empty, the explode fails with the following error:
org.apache.spark.sql.AnalysisException: cannot resolve 'phone-numbers['home']' due to data type mismatch: argument 2 requires integral type, however, ''home'' is of string type.;;
Is there an elegant way to prevent Spark from exploding this field when it's empty or null?
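One workaround I've considered is filtering out the problematic rows before exploding. This is just a sketch of the idea (untested against my real data); the backticks are there because the hyphen in `phone-numbers` would otherwise be split on the dot, and `size(...)` returns -1 for a null array, so `> 0` should cover both the empty and the null case:

```scala
import org.apache.spark.sql.functions.{col, explode, size}

// Keep only rows where the 'home' array exists and is non-empty,
// then explode. Backticks stop Spark from treating the hyphenated
// column name as a nested path on its own.
val result = dataframe
  .filter(size(col("`phone-numbers`.home")) > 0)
  .select(col("name"), explode(col("`phone-numbers`.home")).as("home"))
```

The downside is that rows with an empty array are dropped entirely rather than kept with a null, which is why I'm asking whether there is a cleaner approach.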