I get no rows when reading the ArrayType field (phoneNumbers) from the JSON below. Without the ArrayType field, I can read the rest of the values.

{
    "firstName": "Rack",
    "lastName": "Jackon",
    "gender": "man",
    "age": 24,
    "address": {
        "streetAddress": 126,
        "city": "San Jone",
        "state": "CA",
        "postalCode": 394221
    },
    "phoneNumbers": [
        { "type": "home", "number": 7383627627}
    ]
}

My schema:
val schema=StructType(List(
      StructField("firstName",StringType),
      StructField("lastName",StringType),
      StructField("gender",StringType),
      StructField("age",IntegerType),
      StructField("address",StructType(List(
        StructField("streetAddress",StringType),
        StructField("city",StringType),
        StructField("state",StringType),
        StructField("postalCode",IntegerType)))),
      StructField("phoneNumbers",ArrayType(StructType(List(
      StructField("type",StringType),
      StructField("number",IntegerType))))),
    ))

json_df.selectExpr("firstName","lastName",
      "gender","age","address.streetAddress","address.city",
      "address.state","address.postalCode",
      "explode(phoneNumbers) as phone","phone.type","phone.number").drop("phone").show()

When I call .show(), it prints only the column names and no rows, but when I leave out the "phoneNumbers" array, it works fine.

1 Answer

IntegerType represents 4-byte signed integers with a maximum value of 2147483647, which is too small to hold a phone number such as 7383627627. Use either LongType or StringType for phone numbers.
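The range limit can be checked in plain Scala without Spark. This is a minimal sketch; the object name `PhoneNumberRange` and the helper `fitsInInt` are made up for illustration and are not part of any library.

```scala
object PhoneNumberRange {
  // The phone number from the sample JSON, held as a 64-bit Long.
  val phone: Long = 7383627627L

  // True when a value fits in a 4-byte signed integer,
  // which is the range Spark's IntegerType can store.
  def fitsInInt(n: Long): Boolean =
    n >= Int.MinValue && n <= Int.MaxValue
}
```

Here `PhoneNumberRange.fitsInInt(PhoneNumberRange.phone)` evaluates to false, since 7383627627 is larger than `Int.MaxValue` (2147483647).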

You got no results from your select query because the phoneNumbers array you're exploding has no usable elements (it is empty or null), and explode returns 0 rows in that case. The array could not be populated because 7383627627 does not fit into an IntegerType column.
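A possible corrected version is sketched below, assuming Spark 2.2 or later is on the classpath. The schema is the one from the question with `number` changed to LongType. The object name `PhoneSchemaFix`, the helper `flatten`, and the inline `sample` string are assumptions made for illustration; in your code you would read from your own JSON source instead. The explode is done in a separate `withColumn` step before selecting `phone.type` and `phone.number`.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode}
import org.apache.spark.sql.types._

object PhoneSchemaFix {
  // Same schema as in the question, except phoneNumbers.number is LongType.
  val schema: StructType = StructType(List(
    StructField("firstName", StringType),
    StructField("lastName", StringType),
    StructField("gender", StringType),
    StructField("age", IntegerType),
    StructField("address", StructType(List(
      StructField("streetAddress", StringType),
      StructField("city", StringType),
      StructField("state", StringType),
      StructField("postalCode", IntegerType)))),
    StructField("phoneNumbers", ArrayType(StructType(List(
      StructField("type", StringType),
      StructField("number", LongType)))))
  ))

  // The sample record from the question, written on one line (illustrative input).
  val sample: String =
    """{"firstName":"Rack","lastName":"Jackon","gender":"man","age":24,"address":{"streetAddress":126,"city":"San Jone","state":"CA","postalCode":394221},"phoneNumbers":[{"type":"home","number":7383627627}]}"""

  // Reads the sample with the explicit schema and produces one row per phone number.
  def flatten(spark: SparkSession): DataFrame = {
    import spark.implicits._
    val jsonDf = spark.read.schema(schema).json(Seq(sample).toDS())
    jsonDf
      .withColumn("phone", explode(col("phoneNumbers")))
      .select(col("firstName"), col("lastName"), col("gender"), col("age"),
        col("address.streetAddress"), col("address.city"),
        col("address.state"), col("address.postalCode"),
        col("phone.type"), col("phone.number"))
  }

  def main(args: Array[String]): Unit = {
    // Local session for trying the sketch out; adjust master/appName as needed.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("phone-schema-fix")
      .getOrCreate()
    flatten(spark).show()
    spark.stop()
  }
}
```

With LongType the sample record should come back as a single row containing the phone number. StringType would work the same way if you prefer to treat phone numbers as text.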