0
votes

I am trying to convert a JSON File into CSV but I don't seem to have any luck in doing so. My JSON looks something like that: ... {

{"meta": {
    "contentType": "Response"
},
"content": {
    "data": {
        "_type": "ObjectList",
        "erpDataObjects": [
            {
                "meta": {
                    "lastModified": "2020-08-10T08:37:21.000+0000",
                },
                "head": {
                    "fields": {
                        "number": {
                            "value": "1",
                        },
                        "id": {
                            "value": "10000"
                        },
                    }
                }
            {
                "meta": {
                    "lastModified": "2020-08-10T08:37:21.000+0000",
                },
                "head": {
                    "fields": {
                        "number": {
                            "value": "2",
                        },
                        "id": {
                            "value": "10001"
                        },
                    }
                }
            {
                "meta": {
                    "lastModified": "2020-08-10T08:37:21.000+0000",
                },
                "head": {
                .. much more data

I basically want my csv to look like this:

number,id
1,10000
2,10001

My flow looks like this:

GetFile -> Set the output-file name -> ConvertRecord -> UpdateAttribute -> PutFile

ConvertRecord uses the JsonTreeReader and a CSVRecordSetWriter JsonTreeReader CsvRecordSetWriter. They both call on an AvroSchemaRegistry which looks like this: AvroSchemaRegistry The AvroSchema itself looks like this:

{
   "type": "record",
   "name": "head",
   "fields":
   [
      {"name": "number", "type": ["string"]},
      {"name": "id", "type": ["string"]},
   ]
}

But I only get this output:

number,id
,

Which makes sense because I'm not specifically indicating where those values are located. I used the JsonPathReader instead before but it only looked like this: JsonPathReader Which obvioulsy only gave me one record. I'm not really sure how I can configure either of the two to output exactly what I want. Help would be much appreciated!

1

1 Answers

0
votes

Using ConvertRecord for JSON -> CSV is mostly intended for "flat" JSON files where each field in the object becomes a column in the outgoing CSV file. For nested/complex structures, consider JoltConvertRecord, it allows you to do more complex transformations. Your example doesn't appear to be valid JSON as-is, but assuming you have something like this as input:

{
  "meta": {
    "contentType": "Response"
  },
  "content": {
    "data": {
      "_type": "ObjectList",
      "erpDataObjects": [
        {
          "meta": {
            "lastModified": "2020-08-10T08:37:21.000+0000"
          },
          "head": {
            "fields": {
              "number": {
                "value": "1"
              },
              "id": {
                "value": "10000"
              }
            }
          }
        },
        {
          "meta": {
            "lastModified": "2020-08-10T08:37:21.000+0000"
          },
          "head": {
            "fields": {
              "number": {
                "value": "2"
              },
              "id": {
                "value": "10001"
              }
            }
          }
        }
      ]
    }
  }
}

The following JOLT spec should give you what you want for output:

[
  {
    "operation": "shift",
    "spec": {
      "content": {
        "data": {
          "erpDataObjects": {
            "*": {
              "head": {
                "fields": {
                  "number": {
                    "value": "[&4].number"
                  },
                  "id": {
                    "value": "[&4].id"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
]