Let me ask about what can be approptiate distkey(dist style) and sort key for master tables in Redshift.
We have several master tables with differnet size, purpose and cardinalities and now we're verify what SORTKEY and DISTKEY combination would be more appropriate.
For example, we have two main key in our customer master table, frequent use in our BI queries and join key for other tables, as follows;
Ex. MST_CUSTOMER (about 7 millions, 850MB in SQL Server)
CUSTOMER_ID <-- Primary Key, Unique and High Cardinality
CUSTOMER_CATEGORY <-- Low Cardinality (like VIP, BZ) but not always uses in BI query
CUST_NAME
CUST_ADDRESS...
First, if we can use both columns for SORTKEY, which is the best order in this case?
1.Should Low cardinality comes first (CUST_CATEGORY, CUST_ID)
2.Hifh frequency comes first (CUST_ID, CUST_CATEGORY)
3.should chose only single column eigher of CUST_ID or CUST_CATEGORY
And generally, may I ask what is the best combination in such case?
1.DISTSTYLE KEY & COMPOUND SORTKEY
2.DISTSTYLE KEY & SORTKEY
3.DISTSTYLE ALL & INTERLEAVED SORTKEY...
I imagine if we would find 'DS_BCAST_INNER' in query plan frequently, we'd better to consider 'DISTSTYLE ALL'. But still not sure what can be best practice for master tables.
Any suggestion would be appreciated again.
Best Regards