2
votes

I have a case when my data in in nested arrays of jsonb in order to find the value I have to do multiple JSONB_ARRAY_ELEMENTS which is costly and takes a lots of nested code.

The json file has the continents inside countries and inside cities. I need to access a city value.

  1. Is there a way to make this query simpler and faster?
  2. I was trying to solve it using JSON_EXTRACT_PATH but in order to get in to a array but I need the indexes.
WITH mydata AS (
    SELECT '
    {
   "continents":[
      {
         "name":"America",
         "area":43316000,
         "countries":[
            {
               "country_name":"Canada",
               "capital":"Toronto",
               "cities":[
                  {
                     "city_name":"Ontario",
                     "population":2393933
                  },
                  {
                     "city_name":"Quebec",
                     "population":12332
                  }
               ]
            },
            {
               "country_name":"Brazil",
               "capital":"Brasilia",
               "cities":[
                  {
                     "city_name":"Sao Paolo",
                     "population":34534534
                  },
                  {
                     "city_name":"Rio",
                     "population":445345
                  }
               ]
            }
         ]
      },
      {
         "name":"Europa",
         "area":10530751,
         "countries":[
            {
               "country_name":"Switzerland",
               "capital":"Zurich",
               "cities":[
                  {
                     "city_name":"Ginebra",
                     "population":4564565
                  },
                  {
                     "city_name":"Basilea",
                     "population":4564533
                  }
               ]
            },
            {
               "country_name":"Norway",
               "capital":"Oslo",
               "cities":[
                  {
                     "city_name":"Oslo",
                     "population":3243534
                  },
                  {
                     "city_name":"Steinkjer",
                     "population":4565465
                  }
               ]
            }
         ]
      }
   ]
}
'::JSONB AS data_column
)
SELECT cit.city->>'city_name' AS city,
       (cit.city->>'population')::INTEGER AS population
  FROM (SELECT JSONB_ARRAY_ELEMENTS(coun.country->'cities') AS city
          FROM (SELECT JSONB_ARRAY_ELEMENTS(cont.continent->'countries') AS country
                  FROM (SELECT JSONB_ARRAY_ELEMENTS(data_column->'continents') AS continent
                          FROM mydata
                       ) AS cont
                 WHERE cont.continent @> '{"name":"Europa"}'
               ) AS coun
         WHERE coun.country @> '{"country_name" : "Norway"}'
       ) AS cit
 WHERE cit.city @> '{"city_name": "Oslo"}'

See my nested queries? looks ugly, I can get the answer using: JSONB_EXTRACT_PATH( data_column->'continents', '1', 'countries', '1', 'cities', '0', 'population') but I had to hardcode the array indexes.

Hope you can help me out.

Thanks.

1

1 Answers

3
votes

You don't need any nesting, you can do lateral queries:

SELECT
  city->>'city_name' AS city,
  (city->>'population')::INTEGER AS population
FROM
  mydata,
  JSONB_ARRAY_ELEMENTS(data_column->'continents') AS continent,
  JSONB_ARRAY_ELEMENTS(continent->'countries') AS country,
  JSONB_ARRAY_ELEMENTS(country->'cities') AS city
WHERE continent ->> 'name' = 'Europa'
  AND country ->> 'country_name' = 'Norway'
  AND city ->> 'city_name' = 'Oslo';

(online demo)

However, since you mentioned paths and having to specify indices in there, this is actually the perfect use case for Postgres 12 JSON paths:

SELECT jsonb_path_query(data_column, '$.continents[*]?(@.name == "Europa").countries[*]?(@.country_name=="Norway").cities[*]?(@.city_name=="Oslo")') FROM mydata

(online demo)