I am using the COPY command to load the JSON dataset from S3 to Redshift table. The data is getting loaded partially but it ignores records which has missing data(key-value/array) i.e. from the below example only the first record will get loaded.
Query:
COPY address from 's3://mybucket/address.json'
credentials 'aws_access_key_id=XXXXXXX;aws_secret_access_key=XXXXXXX' maxerror as 250
json 's3:/mybucket/address_jsonpath.json';
My question is how can I load all the records from address.json even when some records will have missing key/data, similar to the below sample data set.
Sample of JSON
{
"name": "Sam P",
"addresses": [
{
"zip": "12345",
"city": "Silver Spring",
"street_address": "2960 Silver Ave",
"state": "MD"
},
{
"zip": "99999",
"city": "Curry",
"street_address": "2960 Silver Ave",
"state": "PA"
}
]
}
{
"name": "Sam Q",
"addresses": [ ]
}
{
"name": "Sam R"
}
Is there an alternative to FILLRECORD for JSON dataset?
I am looking for an implementation or a workaround which can load all the above 3 records in the Redshift table.
address_jsonpath.json
file content is really important here to provide an answer. – Red Boy