I initially ran into this when I was selecting from one table with an identity, primary key and sort key into another table with its own set of identity, primary, sort. Instead of respecting the (1,1) identity as it was defined, it doing (1,8) (sometimes 3,8). I think it might be because the original table was sorted? In trying to figure out what was going on, I made a much simpler query and data and found a reproducible example across multiple redshift clusters. Take this test example:
drop table if exists test;
create temp table test (id int identity(1,1) not null
, value varchar(16)
, primary key (id))
diststyle all
sortkey (id);
insert into test (value) select 'a';
insert into test (value) select 'b';
insert into test (value) select 'c' union select 'd';
insert into test (value) values ('e'), ('f'), ('g');
select * from test;
The output I get is:
id value
1 a
2 b
9 c
10 d
3 e
4 f
5 g
You'll notice the identity column is not incrementing correctly. I had friends on other clusters try this, they got 20, 27 and 65, 60 for the c and d columns, while the other columns are in order. Please note that the output is still "sorted" correctly, by the sortkey/order of input, despite that the id column isn't physically in order.
The only similarity I can think of between the weird original results I got when first finding this and the test query is that unions are sorted and my table had a sortkey on it.
Other thoughts as to why this is happening and how to fix it are welcome.