In Redshift, if a dimension table and a fact table use their PK and FK (respectively) as the distribution key, but no foreign key constraint is declared in the fact table's DDL, will the tables be collocated and joined without any redistribution?
1 Answer
First, it's worth pointing out that Amazon Redshift does not enforce foreign key constraints. They are informational only, but the query planner does use them when optimizing queries. See: Define Primary Key and Foreign Key Constraints
Colocation
If a table has a Distribution Key, then the rows are distributed across all slices, but rows relating to a specific DISTKEY value will all be located on the same slice.
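Redshift applies a hashing algorithm to the DISTKEY value to determine which slice stores the row. The same algorithm is used for every table, so equal values hash to the same slice no matter which table they belong to. For example: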
- Let's say there are two tables: Table-A and Table-B
  - Table-A has DISTKEY set to column PK
  - Table-B has DISTKEY set to column FK
- A row is stored in Table-A with PK = 100
- A row is stored in Table-B with FK = 100
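The data for the rows in both tables will be located on the same slice. This is true even without a Foreign Key Constraint defined between the tables. Since the matching rows already sit on the same slice, a join on PK = FK can generally be performed without redistributing either table.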
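To make the idea concrete, here is a minimal Python sketch of hash-based distribution. It is only an illustration, not Redshift's implementation: the MD5-based hash is a stand-in for whatever hash Redshift uses internally, and `NUM_SLICES`, `slice_for`, `distribute`, `local_join` and the sample rows are all hypothetical names and data made up for this example.

```python
import hashlib

NUM_SLICES = 4  # hypothetical slice count, chosen only for this example


def slice_for(dist_key_value, num_slices=NUM_SLICES):
    """Map a distribution key value to a slice number.

    MD5 is a stand-in hash here; it is NOT the hash Redshift uses.
    The point is only that the mapping depends on the value, not the table.
    """
    digest = hashlib.md5(str(dist_key_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_slices


def distribute(rows, dist_key):
    """Place each row on the slice chosen by hashing its dist_key column."""
    slices = {s: [] for s in range(NUM_SLICES)}
    for row in rows:
        slices[slice_for(row[dist_key])].append(row)
    return slices


def local_join(a_slices, b_slices):
    """Join Table-A to Table-B on PK = FK, looking only within each slice."""
    joined = []
    for s in range(NUM_SLICES):
        for a in a_slices[s]:
            for b in b_slices[s]:
                if a["PK"] == b["FK"]:
                    joined.append((a["PK"], a["name"], b["amount"]))
    return joined


# Invented sample data: Table-A is distributed on PK, Table-B on FK.
table_a = [{"PK": 100, "name": "widget"}, {"PK": 200, "name": "gadget"}]
table_b = [
    {"FK": 100, "amount": 5},
    {"FK": 100, "amount": 7},
    {"FK": 200, "amount": 1},
]

a_slices = distribute(table_a, "PK")
b_slices = distribute(table_b, "FK")
result = local_join(a_slices, b_slices)
```

In this sketch no foreign key relationship is declared anywhere; the two tables only share the hash function and the key values. `local_join` never compares rows from different slices, yet it still finds every matching pair, which is the sense in which collocated tables can be joined without moving data between slices.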