We have a 2-node Redshift cluster with a table with around 100M records. We marked a timestamp column as the sortkey - because the queries are always time restricted. However, our use-case requires the results has to be sorted in the descending order (on the sortkey).
After some benchmarking, we noticed that the average time taken around 10s. However, when the reverse ordering was removed, the average time came down to under 1s.
Is it possible to reverse the order of the sortkey to be of descending order? The official documentation doesn't seem to indicate that is possible. However I tried putting this while creating a new table:
sortkey(start_time DESC)
There were no errors but it doesn't seem to have any effect.
EDIT: Added the result of EXPLAIN statement on the queries.
The query with order_by ASC
explain select * from kcdr_sr_desc where user_id=396747 and start_time > '2016-01-01' and start_time < '2016-07-01' order by start_time limit 20;
Result:
XN Limit (cost=0.00..10.86 rows=20 width=300) -> XN Merge (cost=0.00..709235.56 rows=1306585 width=300) Merge Key: start_time -> XN Network (cost=0.00..709235.56 rows=1306585 width=300) Send to leader -> XN Seq Scan on kcdr_sr_desc (cost=0.00..709235.56 rows=1306585 width=300) Filter: ((user_id = 396747) AND (start_time > '2016-01-01 00:00:00'::timestamp without time zone) AND (start_time < '2016-07-01 00:00:00'::timestamp without time zone))
The query with order_by DESC
explain select * from kcdr_sr_desc where user_id=396747 and start_time > '2016-01-01' and start_time < '2016-07-01' order by start_time desc limit 20
Result:
XN Limit (cost=1000000841967.42..1000000841967.47 rows=20 width=300) -> XN Merge (cost=1000000841967.42..1000000845233.88 rows=1306585 width=300) Merge Key: start_time -> XN Network (cost=1000000841967.42..1000000845233.88 rows=1306585 width=300) Send to leader -> XN Sort (cost=1000000841967.42..1000000845233.88 rows=1306585 width=300) Sort Key: start_time -> XN Seq Scan on kcdr_sr_desc (cost=0.00..709235.56 rows=1306585 width=300) Filter: ((user_id = 396747) AND (start_time > '2016-01-01 00:00:00'::timestamp without time zone) AND (start_time < '2016-07-01 00:00:00'::timestamp without time zone))