Split/Slice large JSON sort free Unique by few columns & add additional element using jq

Question

Following "Split/Slice large JSON using jq", we are able to slice a huge input file into smaller chunks of data based on array size.

I would now like to add a new JSON element to each array entry, holding an incrementing sequence number based on its position in the original array, and also filter for unique entries by a few columns.

Input:

{"recDt":"2021-01-05",
 "country":"US",
 "name":"ABC",
 "number":"9828",
 "add": [
     {"evnCd":"O","rngNum":"1","state":"TX","city":"ANDERSON","postal":"77830"},
     {"evnCd":"O","rngNum":"2","state":"TX","city":"ANDERSON","postal":"77830"},
     {"evnCd":"O","rngNum":"3","state":"TX","city":"ANDERSON","postal":"77831"},
     {"evnCd":"O","rngNum":"4","state":"TX","city":"ANDERSON","postal":"77832"}
 ]
}

Expected output after adding the additional key:

{"recDt":"2021-01-05",
 "country":"US",
 "name":"ABC",
 "number":"9828",
 "add": [
     {"rownum":1,"evnCd":"O","rngNum":"1","state":"TX","city":"ANDERSON","postal":"77830"},
     {"rownum":2,"evnCd":"O","rngNum":"2","state":"TX","city":"ANDERSON","postal":"77830"},
     {"rownum":3,"evnCd":"O","rngNum":"3","state":"TX","city":"ANDERSON","postal":"77831"},
     {"rownum":4,"evnCd":"O","rngNum":"4","state":"TX","city":"ANDERSON","postal":"77832"}
 ]
}

Expected output after filtering for unique entries (by state, city, postal) and slicing with an array size of 2:

{"recDt":"2021-01-05",
 "country":"US",
 "name":"ABC",
 "number":"9828",
 "add": [
     {"rownum":1,"evnCd":"O","rngNum":"1","state":"TX","city":"ANDERSON","postal":"77830"},
     {"rownum":3,"evnCd":"O","rngNum":"3","state":"TX","city":"ANDERSON","postal":"77831"}]}

{"recDt":"2021-01-05",
 "country":"US",
 "name":"ABC",
 "number":"9828",
 "add": [
     {"rownum":4,"evnCd":"O","rngNum":"4","state":"TX","city":"ANDERSON","postal":"77832"}
 ]
}

The command below was used to filter for unique entries by a few columns, but it does not achieve optimal performance:

< input.json jq -r --argjson size 2 '.add |= unique_by({city,state,postal}) | del(.add) as $object | (.add | _nwise($size) | ("\t", $object + {add:.}))' | awk '/^\t/ {fn++; next} { print >> "part-" fn ".json"}'

Not a free coding service. What have you done, and what specifically are the issues you are having? Try it yourself first. — dawg
I'm not looking for a free coding service. If you look at the earlier post referenced here, the code I tried is shared there: < input.json jq -r --argjson size 2 '.add |= unique_by({rngNum,state,postal}) | del(.add) as $object | (.add | _nwise($size) | ("\t", $object + {add:.}))' | awk '/^\t/ {fn++; next} { print >> "part-" fn ".json"}' This is not optimal for performance. I need to see how it could be changed or tweaked to perform better. — Ilan
It's not too late to fold the additional information into the text of the Q here. — peak
You could consider using a tokenizer that will logically split JSON objects. Here is an example. — dawg
dawg, thanks for the reference. I'm looking more for a shell (sh) solution. Thanks. — Ilan

ikegami · Accepted Answer · 2022-01-26T18:12:52

One could use

.add |= [ range(length) as $i | .[$i] | .rownum = $i+1 ]

Demo on jqplay

or

.add |= ( to_entries | map( .value.rownum = .key+1 | .value ) )

Demo on jqplay
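
To show how the numbering step might fit into the rest of the pipeline from the question, here is a minimal sketch that numbers the entries, filters them with unique_by, and then slices. It rests on a few assumptions: the temporary file and the variable names ($input, $chunks, $rows) are made up for illustration, {rownum: ...} + . is used instead of .rownum = ... only so the new key comes first as in the expected output, and a plain range/slice loop stands in for the undocumented _nwise. Note that unique_by sorts by the key, so this is not a sort-free filter, and it is not claimed to be faster than the original command.

```shell
# Sample input from the question, written to a temporary file (hypothetical path).
input=$(mktemp)
cat > "$input" <<'EOF'
{"recDt":"2021-01-05","country":"US","name":"ABC","number":"9828",
 "add": [
  {"evnCd":"O","rngNum":"1","state":"TX","city":"ANDERSON","postal":"77830"},
  {"evnCd":"O","rngNum":"2","state":"TX","city":"ANDERSON","postal":"77830"},
  {"evnCd":"O","rngNum":"3","state":"TX","city":"ANDERSON","postal":"77831"},
  {"evnCd":"O","rngNum":"4","state":"TX","city":"ANDERSON","postal":"77832"}
 ]}
EOF

chunks=$(jq -c --argjson size 2 '
  # Number each entry by its position in the original array (rownum first).
  .add |= [ range(length) as $i | .[$i] | {rownum: ($i + 1)} + . ]
  # Keep the first entry per (state, city, postal); unique_by sorts by that key.
  | .add |= unique_by({state, city, postal})
  # Emit one copy of the parent object per slice of at most $size entries.
  | del(.add) as $object
  | .add as $rows
  | range(0; $rows | length; $size) as $i
  | $object + {add: $rows[$i:$i + $size]}
' "$input")

# One compact JSON object per line, one line per slice.
printf '%s\n' "$chunks"
rm -f "$input"
```

For the sample input this should print two objects: the first with the entries numbered 1 and 3, the second with the entry numbered 4, matching the expected output in the question. Because the numbering runs before the filter, the surviving entries keep their original positions rather than being renumbered.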
