If I need to denormalize a nested JSONB structure into a flat table, can it realistically be done in a single SELECT when the structure mixes key/value objects and arrays like the following? I have a table containing two columns, repository and data, and an example data column value looks like this:
{
  "branches": {
    "master": {
      "type1": {
        "files": [{
          "filename": "a.txt",
          "contents": [{
            "identifier": "foo",
            "value": "bar"
          }]
        }]
      }
    }
  }
}
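For reference, the table definition is roughly the following (the table name repos is just a placeholder, not the real name):

CREATE TABLE repos (
    repository text,
    data       jsonb
);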
My goal is a query that flattens this structure into something like the following CSV-style output:
repository,branch,type,filename,identifier,value
a/b,master,type1,a.txt,foo,bar
I have kept the JSON example as simple as possible, but I would ultimately end up with around a million lines.
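To show what I am after, here is a rough sketch of the kind of single SELECT I imagine, using jsonb_each for the keyed objects and jsonb_array_elements for the arrays (the table name repos is the placeholder from above, and I haven't verified this is correct or efficient at that scale):

SELECT r.repository,
       branch.key                     AS branch,
       typ.key                        AS type,
       file.value ->> 'filename'      AS filename,
       content.value ->> 'identifier' AS identifier,
       content.value ->> 'value'      AS value
FROM repos r
  -- one row per branch key under "branches"
  CROSS JOIN LATERAL jsonb_each(r.data -> 'branches')               AS branch(key, value)
  -- one row per type key within a branch
  CROSS JOIN LATERAL jsonb_each(branch.value)                       AS typ(key, value)
  -- one row per element of the "files" array
  CROSS JOIN LATERAL jsonb_array_elements(typ.value -> 'files')     AS file(value)
  -- one row per element of the "contents" array
  CROSS JOIN LATERAL jsonb_array_elements(file.value -> 'contents') AS content(value);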
Although the need for a single data JSON column is fixed, its format can be refactored if that would make the denormalization easier or more efficient, e.g. instead of branches and types being keyed off their unique identifiers, they could be converted into arrays that carry the identifiers as fields, as sketched below.
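For example, a refactored shape along these lines is the kind of thing I have in mind (field names like name and types are just placeholders):

{
  "branches": [{
    "name": "master",
    "types": [{
      "name": "type1",
      "files": [{
        "filename": "a.txt",
        "contents": [{ "identifier": "foo", "value": "bar" }]
      }]
    }]
  }]
}

With that shape, the two jsonb_each calls in the sketch above would become jsonb_array_elements calls reading the name field instead of the object key, but I don't know whether that actually makes the query simpler or faster.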
Where does repository come from in the output? Btw: you are actually normalizing the de-normalized data you have. – a_horse_with_no_name

The number of { braces doesn't match the number of } braces. And the contents array isn't closed properly. Is contents really an array? If yes, what if there are multiple identifiers and values in there? Should they be shown as rows or columns in the output? – a_horse_with_no_name