I have a Postgres table with a large jsonb column.
CREATE TABLE mytable (id integer, my_jsonb jsonb);
The my_jsonb column contains data like this:
{
  "name": "Bob",
  "city": "Somecity",
  "zip": "12345"
}
The table contains several million rows.
I need to create facets, i.e. aggregated value counts over individual jsonb fields, for our user interface. For example:
city        | count
------------+------
New York    |  1000
Chicago     |  3000
Los Angeles |  4000
...and maybe 200 more values
My current query, which yields the correct results, looks like this:
select my_jsonb->>'city', count(*)
from mytable
group by my_jsonb->>'city'
order by my_jsonb->>'city';
The problem is that it is painfully slow: 5-10 seconds, depending on which field I pick. Postgres has to do a full table scan and extract the jsonb value from every row.
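For what it's worth, the plan confirms the sequential scan; I've been checking with:

explain (analyze, buffers)
select my_jsonb->>'city', count(*)
from mytable
group by my_jsonb->>'city';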
Question: how do I create an index that supports this query efficiently, and that works no matter which jsonb field I choose?
A GIN index doesn't help: the query planner never uses it for this query. The same goes for a plain btree index on the whole jsonb column.
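For reference, the index definitions I tried were along these lines (index names are just placeholders):

-- GIN over the whole document, with the default jsonb_ops operator class:
CREATE INDEX mytable_jsonb_gin ON mytable USING gin (my_jsonb);

-- Plain btree over the whole jsonb value:
CREATE INDEX mytable_jsonb_btree ON mytable (my_jsonb);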
I'm thinking there might be some kind of expression index, and that I could rewrite the facet query to use the indexed expression, but I haven't worked out the details. A sketch of what I have in mind follows.
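Something like this, I imagine, though it seems to require one index per field, which defeats the "any field" part of the question:

-- One expression index per facet field, e.g. for city:
CREATE INDEX mytable_city_idx ON mytable ((my_jsonb->>'city'));

-- The facet query groups by the exact same expression, so the planner
-- should be able to use the index (possibly as an index-only scan once
-- the table has been vacuumed):
select my_jsonb->>'city', count(*)
from mytable
group by my_jsonb->>'city'
order by my_jsonb->>'city';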
Worst case, I can extract all of the values into a second table and index that, but I'd prefer not to.
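For completeness, the fallback I have in mind is roughly this (the table name is a placeholder, and the side table would need to be kept in sync with mytable by a trigger or batch job):

-- Flatten every jsonb field into a key/value side table:
CREATE TABLE mytable_facets AS
SELECT id, key, value
FROM mytable, jsonb_each_text(my_jsonb);

CREATE INDEX ON mytable_facets (key, value);

-- Facets then become a plain indexed aggregate:
select value as city, count(*)
from mytable_facets
where key = 'city'
group by value
order by value;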