2
votes

Postgresql 10 jsonb to table with multiple rows

I have a a json array that is containing some "users" structure. These json arrays are stroe in a Postgresql database and I would like to get all lines (for all rows for all arrays).

Data sample:

My database instance:

docker run --name postgresql-10 -e POSTGRES_PASSWORD=mysecretpassword -d postgres:10-alpine 

docker run -it --rm --link postgresql-10:postgres postgres:10-alpine psql -h postgres -U postgres

My table:

CREATE TABLE "Reports" (
    "name" TEXT NOT NULL,
    "report" JSONB NOT NULL,
    "timestamp" TIMESTAMP NOT NULL,
    PRIMARY KEY ("name", "timestamp")
)
;

Some data:

insert into "Reports" (timestamp, "name", "report")
  values ('2017-11-05'::timestamp, 
          'appA', 
          '{
         "dateComputed": "2017-11-06 10:06:29 UTC",
         "name": "appA",
         "users": [
           {
             "DATE": "2017-11-03",
             "EMPLID": "415",
             "NAME": "Smith"
               },
               {
             "DATE": "2017-11-03",
         "EMPLID": "4",
         "NAME": "Jane"
           },
           {
             "DATE": "2017-11-03",
             "EMPLID": "31",
             "NAME": "Doe"
           }
         ]
      }'::jsonb
         ) ;

insert into "Reports" (timestamp, "name", "report")
  values ('2017-11-04'::timestamp, 
          'appA', 
          '{
         "dateComputed": "2017-11-04 11:34:13 UTC",
         "name": "appA",
         "users": [
               {
             "DATE": "2017-11-03",
         "EMPLID": "4",
         "NAME": "Jane"
           },
           {
             "DATE": "2017-11-03",
             "EMPLID": "31",
             "NAME": "Doe"
           }
         ]
      }'::jsonb
         ) ;

insert into "Reports" (timestamp, "name", "report")
  values ('2017-11-01'::timestamp, 
          'appA', 
          '{
         "dateComputed": "2017-11-01 02:32:49 UTC",
         "name": "appA",
         "users": [
               {
             "DATE": "2017-11-01",
         "EMPLID": "415",
         "NAME": "Smith"
           },
           {
             "DATE": "2017-11-01",
             "EMPLID": "31",
             "NAME": "Doe"
           }
         ]
      }'::jsonb
         ) ;


insert into "Reports" (timestamp, "name", "report")
  values ('2017-11-03'::timestamp, 'appB', '[{"other": "useless"}]'::jsonb) ;

What I would like is following table listing all users matching "Reports" whose "name" is 'AppA' :

+------------+-------+--------+
| DATE       | NAME  | EMPLID |
+------------+-------+--------+
| 2017-11-03 | Smith | 415    |
+------------+-------+--------+
| 2017-11-03 | Jane  | 4      |
+------------+-------+--------+
| 2017-11-03 | Doe   | 31     |
+------------+-------+--------+
| 2017-11-03 | Jane  | 4      |
+------------+-------+--------+
| 2017-11-03 | Doe   | 31     |
+------------+-------+--------+
| 2017-11-01 | Smith | 415    |
+------------+-------+--------+
| 2017-11-01 | Doe   | 31     |
+------------+-------+--------+

+------------+------------+-------+--------+
| timestamp  | DATE       | NAME  | EMPLID |
+------------+------------+-------+--------+
| 2017-11-05 | 2017-11-03 | Smith | 415    |
+------------+------------+-------+--------+
| 2017-11-05 | 2017-11-03 | Jane  | 4      |
+------------+------------+-------+--------+
| 2017-11-05 | 2017-11-03 | Doe   | 31     |
+------------+------------+-------+--------+
| 2017-11-04 | 2017-11-03 | Jane  | 4      |
+------------+------------+-------+--------+
| 2017-11-04 | 2017-11-03 | Doe   | 31     |
+------------+------------+-------+--------+
| 2017-11-03 | 2017-11-01 | Smith | 415    |
+------------+------------+-------+--------+
| 2017-11-03 | 2017-11-01 | Doe   | 31     |
+------------+------------+-------+--------+

When I am only matching a single row then I am able to use jsonb_to_recordset to get all json lines matching that row. For instance when filtering on latest timestamp column by creating a view:

CREATE INDEX "ReportsGIN" on "Reports" USING gin ("report") ;

CREATE VIEW "Reports_Latest_timestamp"
AS
SELECT  "name"
       , max("Reports"."timestamp") AS "timestamp_latest"
FROM "Reports"
GROUP BY "name"
;

CREATE VIEW "Reports_Latest"
AS
SELECT   "Reports"."name"
       , "Reports"."report"
       , "Reports"."timestamp"
FROM "Reports"
WHERE ("Reports"."timestamp" = (SELECT "Reports_Latest_timestamp"."timestamp_latest" FROM "Reports_Latest_timestamp" WHERE "Reports_Latest_timestamp"."name" = "Reports"."name"))
;

select *
from
jsonb_to_recordset
(
 (select report#>'{users}'
  from "Reports_Latest" 
  where "name" = 'appA' 
) 
) as x(
        "EMPLID" integer
      , "NAME" text
      , "DATE" timestamp with time zone

)
;

 EMPLID | NAME  |          DATE
--------+-------+------------------------
    415 | Smith | 2017-11-03 00:00:00+00
      4 | Jane  | 2017-11-03 00:00:00+00
     31 | Doe   | 2017-11-03 00:00:00+00
(3 rows)

jsonb_to_recordset is working as expected.

How can I use jsonb_to_recordset to list lines for all "reports" row?

An answer to display "timestamp" on "Reports_Latest" would be (but still no clue for full "Reports" lines):

select  t."timestamp"
      , r."EMPLID"
      , r."NAME"
      , r."DATE"
from
(
  select "timestamp", report#>'{users}'
  from "Reports_Latest" 
  where "name" = 'appA' 
) as t
, (
  select *
  from
    jsonb_to_recordset
    (
     (select report#>'{users}'
      from "Reports_Latest" 
      where "name" = 'appA' 
    ) 
    ) as x(
            "EMPLID" integer
          , "NAME" text
          , "DATE" timestamp with time zone

    )
  ) as r
;


      timestamp      | EMPLID | NAME  |          DATE
---------------------+--------+-------+------------------------
 2017-11-05 00:00:00 |    415 | Smith | 2017-11-03 00:00:00+00
 2017-11-05 00:00:00 |      4 | Jane  | 2017-11-03 00:00:00+00
 2017-11-05 00:00:00 |     31 | Doe   | 2017-11-03 00:00:00+00
(3 rows)

SQL Fiddle on Postgresql 9.6 to quick test


The solution provided by Breathe is:

select r."timestamp", x.*
from "Reports" as r
cross join lateral jsonb_to_recordset (r.report#>'{users}')
 as x(
        "EMPLID" integer
      , "NAME" text
      , "DATE" timestamp with time zone

)
where r."name" = 'appA' 
;

      timestamp      | EMPLID | NAME  |          DATE
---------------------+--------+-------+------------------------
 2017-11-05 00:00:00 |    415 | Smith | 2017-11-03 00:00:00+00
 2017-11-05 00:00:00 |      4 | Jane  | 2017-11-03 00:00:00+00
 2017-11-05 00:00:00 |     31 | Doe   | 2017-11-03 00:00:00+00
 2017-11-04 00:00:00 |      4 | Jane  | 2017-11-03 00:00:00+00
 2017-11-04 00:00:00 |     31 | Doe   | 2017-11-03 00:00:00+00
 2017-11-01 00:00:00 |    415 | Smith | 2017-11-01 00:00:00+00
 2017-11-01 00:00:00 |     31 | Doe   | 2017-11-01 00:00:00+00
(7 rows)

http://sqlfiddle.com/#!17/cd4df/9/0

1

1 Answers

1
votes

Essentially, what you are trying to do is to generate as many record as there are users in one single row of "Reports". In your desired table structure, the first two columns are the first two columns from "Reports". So your query needs:

select a.timestamp, a."name"
FROM "Reports" a

Then you want to create a "subset" of records for each of these. This is acheivable by applying the functiont that generate the subset to litterally all the rows. The subset is generated by the function jsonb_to_recordset(), therefore :

SELECT a.timestamp, a."name", b. *
FROM "Reports" a
CROSS JOIN lateral jsonb_to_recordset(a.report->'Users')
as b("EMPLID" integer
      , "NAME" text
      , "DATE" timestamp with time zone)

EDIT: I added the cross join lateral