I have a large chunk of JSON containing about 10 elements. Each element has an ID, a few other attributes, and a links attribute whose entries sometimes have IDs of their own. Is there a way to get only the top-level ID of each element using bash (preferably without external libraries)?
Here is an example:
{
"page": {
"size": 10,
"number": 1,
"totalPages": 1,
"totalElements": 10,
"resultSetId": "TODO",
"duration": 999
},
"content": [
{
"id": "fbc67d7a-50a3-4c1c-9a75-4db0ba5dcb07",
"name": "volume 0",
"userTags": [],
"links": [
{
"rel": "whatever",
"href": "/whatever/67b46e10-21ed-4394-b706-9eb61d75933e",
"id": "67b46e10-21ed-4394-b706-9eb61d75933e"
},
{
"rel": "whatever_else",
"href": "/whatever_else/fbc67d7a-50a3-4c1c-9a75-4db0ba5dcb07/workflowList"
},
{
"rel": "stuff",
"href": "/stuff/fbc67d7a-50a3-4c1c-9a75-4db0ba5dcb07/planList"
},
{
"rel": "self",
"href": "/self/fbc67d7a-50a3-4c1c-9a75-4db0ba5dcb07",
"id": "fbc67d7a-50a3-4c1c-9a75-4db0ba5dcb07"
},
{
"rel": "container",
"href": "/container/575a0c38-c60a-4d52-ba38-cb20f4b6d9e7",
"id": "575a0c38-c60a-4d52-ba38-cb20f4b6d9e7"
},
{
"rel": "parent",
"href": "/parent/85b7f0e7-b946-4bc4-9ca6-582a5ca08c51",
"id": "85b7f0e7-b946-4bc4-9ca6-582a5ca08c51"
}
],
"discovered": false,
"lastUpdated": "2015-11-20T09:33:05.757-0800",
"nativeUri": null,
"vendor": null,
"suspended": [],
"enabled": []
},
{
"id": "4292014f-01cd-4369-9cc0-7bf41a8be53d",
"name": "Storage_Group_001",
"attributes": {},
"userTags": [],
"links": [
{
"rel": "stuff",
"href": "/stuff/67b46e10-21ed-4394-b706-9eb61d75933e",
"id": "67b46e10-21ed-4394-b706-9eb61d75933e"
},
{
"rel": "something",
"href": "/something/4292014f-01cd-4369-9cc0-7bf41a8be53d/workflowList"
},
{
"rel": "whatever",
"href": "/whatever/4292014f-01cd-4369-9cc0-7bf41a8be53d/planList"
},
{
"rel": "self",
"href": "/self/4292014f-01cd-4369-9cc0-7bf41a8be53d",
"id": "4292014f-01cd-4369-9cc0-7bf41a8be53d"
},
{
"rel": "container",
"href": "/stuff/575a0c38-c60a-4d52-ba38-cb20f4b6d9e7",
"id": "575a0c38-c60a-4d52-ba38-cb20f4b6d9e7"
}
],
"lastUpdated": "2015-11-18T06:37:56.739-0800",
"nativeUri": null,
"vendor": null,
"suspended": [],
"enabled": []
},
{
"id": "896aca64-17a6-4acb-a93c-562424dc1bc4",
"name": "volume 4",
"attributes": {},
...
So basically, I just want the top-level id of each element, and none of the ids in the links sections. I got close with awk, and also with perl, but the number of ids in a links section is impossible to predict. Here is my awk attempt, which assumes that every fifth id is a top-level one (I dumped the JSON into a temp file so I didn't have to curl every time):
grep -Po '(?<="id":")[^"]*' tmp.txt | awk 'NR % 5 == 1'
awk is external to bash, but it's still the wrong tool. Use something like jq. - chepner
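Following up on that suggestion: if jq is available, the filter `.content[].id` (for example `jq -r '.content[].id' tmp.txt`) selects exactly the ids of the objects directly inside `content`. Where installing jq is not an option, here is a rough sketch that tracks brace depth with awk instead of counting ids. The function name `top_level_ids` is hypothetical, and the sketch assumes pretty-printed input with one key per line, as in the example, and no braces inside string values. It is not a real JSON parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the "id" of every object that sits at brace depth 2.
# Assumptions: one key per line (pretty-printed JSON), no braces inside strings,
# and the only depth-2 objects carrying an "id" key are the elements of "content".
top_level_ids() {
  awk '
    {
      line = $0
      # Depth 1 is the root object, depth 2 is an element of "content",
      # depth 3 is an entry inside a "links" array.
      if (depth == 2 && match(line, /"id"[ \t]*:[ \t]*"[^"]*"/)) {
        id = substr(line, RSTART, RLENGTH)
        sub(/^"id"[ \t]*:[ \t]*"/, "", id)   # strip the key and opening quote
        sub(/"$/, "", id)                    # strip the closing quote
        print id
      }
      # Update the depth only after the line has been inspected.
      opens  = gsub(/[{]/, "{", line)
      closes = gsub(/[}]/, "}", line)
      depth += opens - closes
    }
  '
}
```

It reads from standard input, so it can be used as `top_level_ids < tmp.txt` or at the end of a curl pipeline. Because it relies on the layout of the text rather than on the JSON grammar, compact single-line output would need to be pretty-printed first.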