How small should a table using Diststyle ALL be in Amazon Redshift?

Question

It says here: http://dwbitechguru.blogspot.com/2014/11/performance-tuning-in-amazon-redshift.html that for very small tables, Redshift should use DISTSTYLE ALL instead of EVEN or KEY. How small is small? If I were to filter on row count in the WHERE clause of the query: select relname, reldiststyle from pg_class, how many rows should I specify?

rohitkulky rohitkulky · Accepted Answer · 2016-01-09T06:22:33

It really depends on the size of the cluster you are using. DISTSTYLE ALL copies the whole table to every node, which removes the need to move that table's data across nodes when it is joined. Find out the size of your table and the storage available on your Redshift nodes; if you can afford to keep a full copy of the table on each node, do it!
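The storage trade-off above is simple arithmetic, and a minimal sketch may make it concrete. The function names, the megabyte units, and the 5% budget in `fits_as_all` are assumptions chosen for illustration, not Redshift rules; plug in your own table size and free space per node.

```python
def footprint_mb(table_mb: float, node_count: int) -> dict:
    """Compare per-node and cluster-wide storage for ALL vs. EVEN distribution."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return {
        # DISTSTYLE ALL: every node holds a full copy of the table.
        "per_node_all_mb": table_mb,
        # DISTSTYLE EVEN: rows are spread out, so each node holds roughly 1/N.
        "per_node_even_mb": table_mb / node_count,
        # Total space the cluster spends on the table under DISTSTYLE ALL.
        "cluster_total_all_mb": table_mb * node_count,
    }


def fits_as_all(table_mb: float, free_mb_per_node: float,
                max_fraction: float = 0.05) -> bool:
    """Hypothetical rule of thumb: accept ALL only if one full copy uses
    at most max_fraction of the free space on a single node."""
    return table_mb <= free_mb_per_node * max_fraction
```

For example, `footprint_mb(200, 4)` shows that a 200 MB table costs 200 MB on each of 4 nodes under ALL, against about 50 MB per node under EVEN. Whether that is "small" depends on how much space each node can spare.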

Also, if you need to join other tables with this table very frequently, say in 70% of your queries, I believe it is worth the space if you want better query performance.

If the join keys across your tables are the same in terms of cardinality, you can instead distribute all of the tables on that key, so that rows with matching keys sit on the same node, which obviates the need to replicate the data.

I would suggest trying out the two options above, comparing the average run times of around 10 queries, and then coming to a decision.
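One way to run that comparison is a small timing harness, sketched below. `run_query` is a hypothetical callable that you supply (for instance, a wrapper around your database driver's execute-and-fetch call); nothing here is specific to Redshift. Run it once against the DISTSTYLE ALL version of the table and once against the KEY version, using the same list of queries.

```python
import statistics
import time
from typing import Callable, Iterable


def average_runtime(run_query: Callable[[str], object],
                    queries: Iterable[str],
                    repeats: int = 3) -> float:
    """Return the mean wall-clock time in seconds over all query executions."""
    timings = []
    for sql in queries:
        for _ in range(repeats):
            start = time.perf_counter()
            run_query(sql)  # hypothetical: executes the SQL and fetches the rows
            timings.append(time.perf_counter() - start)
    if not timings:
        raise ValueError("no queries were executed")
    return statistics.mean(timings)
```

Calling `average_runtime(run_query, my_ten_queries)` for each table layout gives two numbers to compare; the layout with the lower average on your real workload is the one to keep.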
