I have a JSON file like the one below:

{"Codes":[{"CName":"012","CValue":"XYZ1234","CLevel":"0","msg":"","CType":"event"},{"CName":"013","CValue":"ABC1234","CLevel":"1","msg":"","CType":"event"}]}
I want to create a schema for this JSON, and if the file is empty ({}), the result should be an empty string.
However, when I use df.show, the output is:
[[012, XYZ1234, 0, event, ], [013, ABC1234, 1, event, ]]
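The bracketed rows suggest that the Codes column has already been parsed into an array of structs rather than kept as a raw JSON string. A quick way to confirm this is to inspect the schema (a minimal sketch; df is assumed to be the DataFrame read from the file above):

// Print the schema Spark attached to the DataFrame.
// If Codes shows up as array<struct<...>>, the JSON was already parsed on read,
// so there is no string column left for from_json to operate on.
df.printSchema()
df.select("Codes").show(false)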
I created the schema like below:
import org.apache.spark.sql.types._

val schemaF = ArrayType(
  StructType(
    Array(
      StructField("CName", StringType),
      StructField("CValue", StringType),
      StructField("CLevel", StringType),
      StructField("msg", StringType),
      StructField("CType", StringType)
    )
  )
)
When I try the following:

import org.apache.spark.sql.functions.from_json

val df1 = df.withColumn("Codes", from_json('Codes, schemaF))
It gives an AnalysisException:

org.apache.spark.sql.AnalysisException: cannot resolve 'jsontostructs(`Codes`)' due to data type mismatch: argument 1 requires string type, however, '`Codes`' is of array<struct<CName:string,CValue:string,CLevel:string,CType:string,msg:string>> type.;;
'Project [valid#51, jsontostructs(ArrayType(StructType(StructField(CName,StringType,true), StructField(CValue,StringType,true), StructField(CLevel,StringType,true), StructField(msg,StringType,true), StructField(CType,StringType,true)),true), Codes#8, Some(America/Bogota)) AS errorCodes#77]
Can someone please tell me why and how to resolve this issue?
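As the error indicates, from_json expects its input column to contain raw JSON text. For reference, a minimal sketch of the case it is designed for (the single-row DataFrame and the CodesJson column name below are purely illustrative, not from the question):

import org.apache.spark.sql.functions.from_json
import spark.implicits._

// A column that still holds unparsed JSON text is the intended input for from_json.
val rawDf = Seq(
  """[{"CName":"012","CValue":"XYZ1234","CLevel":"0","msg":"","CType":"event"}]"""
).toDF("CodesJson")

// Applying schemaF here works because CodesJson is a string column.
val parsedDf = rawDf.withColumn("Codes", from_json($"CodesJson", schemaF))
parsedDf.printSchema()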
The Codes column is already of type array of struct, so why do you want to use from_json? – blackbishop
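To expand on the comment: since Codes is already of type array<struct<...>>, no parsing step is needed and its fields can be accessed directly. A minimal sketch of one way to do that (assuming df is the DataFrame from the question):

import org.apache.spark.sql.functions.{col, explode}

// Codes is already an array of structs, so explode it into one row per element
// and select the struct fields directly; from_json is not needed here.
val codesDf = df
  .select(explode(col("Codes")).as("code"))
  .select(
    col("code.CName"),
    col("code.CValue"),
    col("code.CLevel"),
    col("code.msg"),
    col("code.CType")
  )

codesDf.show(false)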