Following "Split/Slice large JSON using jq", we were able to slice a huge input file into smaller chunks of data based on array size.

We would now like to add a new JSON element to each array entry: an incrementing sequence number based on the entry's position in the original array. The entries then need to be filtered so that they are unique by a few columns.

Input:

{"recDt":"2021-01-05",
 "country":"US",
 "name":"ABC",
 "number":"9828",
 "add": [
     {"evnCd":"O","rngNum":"1","state":"TX","city":"ANDERSON","postal":"77830"},
     {"evnCd":"O","rngNum":"2","state":"TX","city":"ANDERSON","postal":"77830"},
     {"evnCd":"O","rngNum":"3","state":"TX","city":"ANDERSON","postal":"77831"},
     {"evnCd":"O","rngNum":"4","state":"TX","city":"ANDERSON","postal":"77832"}
 ]
}

Expected output after adding the additional key:

{"recDt":"2021-01-05",
 "country":"US",
 "name":"ABC",
 "number":"9828",
 "add": [
     {"rownum":1,"evnCd":"O","rngNum":"1","state":"TX","city":"ANDERSON","postal":"77830"},
     {"rownum":2,"evnCd":"O","rngNum":"2","state":"TX","city":"ANDERSON","postal":"77830"},
     {"rownum":3,"evnCd":"O","rngNum":"3","state":"TX","city":"ANDERSON","postal":"77831"},
     {"rownum":4,"evnCd":"O","rngNum":"4","state":"TX","city":"ANDERSON","postal":"77832"}
 ]
}

After filtering (unique by state, city, postal) and slicing into arrays of size 2:

{"recDt":"2021-01-05",
 "country":"US",
 "name":"ABC",
 "number":"9828",
 "add": [
     {"rownum":1,"evnCd":"O","rngNum":"1","state":"TX","city":"ANDERSON","postal":"77830"},
     {"rownum":3,"evnCd":"O","rngNum":"3","state":"TX","city":"ANDERSON","postal":"77831"}]}

{"recDt":"2021-01-05",
 "country":"US",
 "name":"ABC",
 "number":"9828",
 "add": [
     {"rownum":4,"evnCd":"O","rngNum":"4","state":"TX","city":"ANDERSON","postal":"77832"}
 ]
}

The sample below was used to filter/unique by a few columns, but it does not attain optimal performance:

< input.json jq -r --argjson size 2 ' .add |= unique_by({city,state,postal}) | del(.add) as $object | (.add|_nwise($size) | ("\t", $object + {add:.} )) ' | awk ' /^\t/ {fn++; next} { print >> "part-" fn ".json"}'
Not a free coding service. What have you done, and what specifically are the issues you are having? Try it yourself first. - dawg
I'm not looking for a free coding service. If you look at the earlier post referenced here, the code I tried is shared: < input.json jq -r --argjson size 2 ' .add |= unique_by({rngNum,state,postal}) | del(.add) as $object | (.add|_nwise($size) | ("\t", $object + {add:.} )) ' | awk ' /^\t/ {fn++; next} { print >> "part-" fn ".json"}' This is not optimal for performance. I need to see how it could be changed or tweaked to perform better. - Ilan
It's not too late to fold the additional information into the text of the Q here. - peak
You could consider using a tokenizer that will logically split JSON objects. Here is an example. - dawg
dawg, thanks for the reference. I'm looking more for an sh solution. Thanks. - Ilan

1 Answer

One could use

.add |= [ range(length) as $i | .[$i] | .rownum = $i+1 ]

Demo on jqplay

or

.add |= ( to_entries | map( .value.rownum = .key+1 | .value ) )

Demo on jqplay
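
Putting the numbering together with the filter and slice steps from the question, a minimal sketch might look like the one below. The function name split_add and its prefix argument are made up for illustration. It assumes jq 1.5 or later (three-argument range and array slices) and uses those documented features in place of the undocumented _nwise helper. It also passes -c so that every chunk is a single line, which lets awk write one file per line with no tab separator. It has not been benchmarked, so treat any performance gain as something to measure on your own data.

```shell
#!/bin/sh
# Hypothetical helper (name and arguments are illustrative only).
# Usage: split_add INPUT_FILE SIZE [PREFIX]
split_add() {
  input=$1
  size=$2
  prefix=${3:-part-}

  jq -c --argjson size "$size" '
    # 1. Number every entry by its position in the original array.
    #    Putting rownum on the left of + makes it the first key.
    .add |= ( to_entries | map({rownum: (.key + 1)} + .value) )
    # 2. Keep one entry per (state, city, postal) combination.
    | .add |= unique_by({state, city, postal})
    # 3. Emit one copy of the enclosing object per chunk of $size entries.
    | del(.add) as $object
    | .add
    | range(0; length; $size) as $i
    | $object + {add: .[$i:$i + $size]}
  ' "$input" |
  # Each output line is one complete chunk: write it to PREFIX<n>.json.
  awk -v prefix="$prefix" '{ fn = prefix NR ".json"; print > fn; close(fn) }'
}
```

For example, split_add input.json 2 would write part-1.json, part-2.json, and so on, each holding compact (not pretty-printed) JSON. Note that unique_by returns the surviving entries sorted by the key object rather than in input order. If the original order matters, appending | sort_by(.rownum) to step 2 should restore it, at the cost of another sort.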