7
votes

PostgreSQL can work with array subscripts starting anywhere.
Consider this example that creates an array with 3 elements with subscripts from 5 to 7:

SELECT ('[5:7]={1,2,3}'::int[]);

Returns:

[5:7]={1,2,3}

Meaning, for instance, that you get the first element with

SELECT ('[5:7]={1,2,3}'::int[])[5];

I want to normalize any given 1-dimensional array to start with array subscript 1.
The best I could come up with:

SELECT ('[5:7]={1,2,3}'::int[])[array_lower('[5:7]={1,2,3}'::int[], 1):array_upper('[5:7]={1,2,3}'::int[], 1)]

Or, the same, easier the read:

WITH x(a) AS (
    SELECT '[5:7]={1,2,3}'::int[]
    )
SELECT a[array_lower(a, 1):array_upper(a, 1)]
FROM   x

Do you know a simpler / faster or at least more elegant way?


Benchmark

For the purpose of testing performance I whipped up this quick benchmark.
Table with 100k rows, simple integer array of random length between 1 an 11:

CREATE TEMP TABLE t (a int[]);
INSERT INTO t -- now with actually varying subscripts
SELECT ('[' || g%10 || ':' || 2*(g%10) || ']={1'
            || repeat(','||g::text, g%10) || '}')::int[]
FROM   generate_series(1,100000) g;

EXPLAIN ANALYZE
SELECT 
       substring(a::text, '{.*$')::int[]       -- Total runtime: 949.304 ms
--     a[-2147483648:2147483647]               -- Total runtime: 283.877 ms
--     a[array_lower(a, 1):array_upper(a, 1)]  -- Total runtime: 311.545 ms
FROM   t

So, yes, @Daniel's idea is slightly faster.
@Kevin's text conversion works, too, but doesn't earn many points.

Any other ideas?

3
Slicing (as @DanielVérité suggested) was the first thing that came to mind for me. Of course you could get the fastest transformation by writing a C function, although it might be pretty close to the slicing timing. The only other alternative that comes to mind would be to cast the array to text, parse out the substring to the right of the =, and cast that back to an array of the right type. And I'm pretty sure that's uglier and more fragile than the slicing. - kgrittn
Erwin has posted the newer, more elegant solution for newer versions. But... for anyone playing with code here (for learning purposes only), some code posted here doesn't run (anymore... if it ever did) as provided. Edit queue is full, so I'll provide corrections here over several comments, for easier reading. What's missing is a set of parentheses around a few key values. At least, that's my experience, running these queries in dbeaver. - Wellspring
SELECT ('[5:7]={1,2,3}'::int[])[(array_lower('[5:7]={1,2,3}'::int[], 1)):(array_upper('[5:7]={1,2,3}'::int[], 1))] - Wellspring
WITH x(a) AS ( SELECT '[5:7]={1,2,3}'::int[] ) SELECT a[((array_lower(a, 1))):((array_upper(a, 1)))] FROM x - Wellspring

3 Answers

6
votes

There is a simpler method that is ugly, but I believe technically correct: extract the largest possible slice out of the array, as opposed to the exact slice with computed bounds. It avoids the two function calls.

Example:

select ('[5:7]={1,2,3}'::int[])[-2147483648:2147483647];

results in:

  int4   
---------
 {1,2,3}
5
votes

Eventually, something more elegant popped up with Postgres 9.6. The manual:

It is possible to omit the lower-bound and/or upper-bound of a slice specifier; the missing bound is replaced by the lower or upper limit of the array's subscripts. For example:

So it's simple now:

SELECT my_arr[:];

With my example array literal you need enclosing parentheses to make the syntax unambiguous::

SELECT ('[5:7]={1,2,3}'::int[])[:];

About the same performance as Daniel's solution with hard-coded max array subscripts - which is still the way to go with Postgres 9.5 or earlier.

3
votes

Not sure if this is already covered, but:

SELECT array_agg(v) FROM unnest('[5:7]={1,2,3}'::int[]) AS a(v);

To test performance I had to add id column on the test table. Slow.