I have a JSON file like the one below:

{"Codes":[{"CName":"012","CValue":"XYZ1234","CLevel":"0","msg":"","CType":"event"},{"CName":"013","CValue":"ABC1234","CLevel":"1","msg":"","CType":"event"}]}
I want to create a schema for this JSON, and if the file is empty ({}), the result should be an empty string.
However, when I use df.show, the output is:
[[012, XYZ1234, 0, event, ], [013, ABC1234, 1, event, ]]
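The bracketed rows suggest that the Codes column has already been parsed into an array of structs rather than kept as a raw JSON string. A quick way to confirm this is to inspect the schema (a minimal sketch; df is assumed to be the DataFrame read from the file above):

// Print the schema Spark attached to the DataFrame.
// If Codes shows up as array<struct<...>>, the JSON was already parsed on read,
// so there is no string column left for from_json to operate on.
df.printSchema()
df.select("Codes").show(false)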
I created the schema like below:
import org.apache.spark.sql.types._

val schemaF = ArrayType(
  StructType(
    Array(
      StructField("CName", StringType),
      StructField("CValue", StringType),
      StructField("CLevel", StringType),
      StructField("msg", StringType),
      StructField("CType", StringType)
    )
  )
)
When I try the following:

import org.apache.spark.sql.functions.from_json

val df1 = df.withColumn("Codes", from_json('Codes, schemaF))
It gives an AnalysisException:

org.apache.spark.sql.AnalysisException: cannot resolve 'jsontostructs(`Codes`)' due to data type mismatch: argument 1 requires string type, however, '`Codes`' is of array<struct<CName:string,CValue:string,CLevel:string,CType:string,msg:string>> type.;;
'Project [valid#51, jsontostructs(ArrayType(StructType(StructField(CName,StringType,true), StructField(CValue,StringType,true), StructField(CLevel,StringType,true), StructField(msg,StringType,true), StructField(CType,StringType,true)),true), Codes#8, Some(America/Bogota)) AS errorCodes#77]
Can someone please tell me why and how to resolve this issue?
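As the error indicates, from_json expects its input column to contain raw JSON text. For reference, a minimal sketch of the case it is designed for (the single-row DataFrame and the CodesJson column name below are purely illustrative, not from the question):

import org.apache.spark.sql.functions.from_json
import spark.implicits._

// A column that still holds unparsed JSON text is the intended input for from_json.
val rawDf = Seq(
  """[{"CName":"012","CValue":"XYZ1234","CLevel":"0","msg":"","CType":"event"}]"""
).toDF("CodesJson")

// Applying schemaF here works because CodesJson is a string column.
val parsedDf = rawDf.withColumn("Codes", from_json($"CodesJson", schemaF))
parsedDf.printSchema()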
The Codes column is already of type array of struct, so why do you want to use from_json? – blackbishop
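To expand on the comment: since Codes is already of type array<struct<...>>, no parsing step is needed and its fields can be accessed directly. A minimal sketch of one way to do that (assuming df is the DataFrame from the question):

import org.apache.spark.sql.functions.{col, explode}

// Codes is already an array of structs, so explode it into one row per element
// and select the struct fields directly; from_json is not needed here.
val codesDf = df
  .select(explode(col("Codes")).as("code"))
  .select(
    col("code.CName"),
    col("code.CValue"),
    col("code.CLevel"),
    col("code.msg"),
    col("code.CType")
  )

codesDf.show(false)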