0
votes

In Redshift, if dimension table and fact table have their PK and FK (respectively) as distribution key, but no foreign key constraint is declared in the fact table's DDL, will the tables be collocated and joined without any redistribution?

1

1 Answers

2
votes

First, it's worth pointing out that Amazon Redshift does not enforce foreign constraints, but they are used to optimize queries. See: Define Primary Key and Foreign Key Constraints

Colocation

If a table has a Distribution Key, then the rows are distributed across all slices, but rows relating to a specific DISTKEY value will all be located on the same slice.

The DISTKEY uses a hashing algorithm to determine which slice stores the data. This is common between tables. For example:

  • Let's say there are two tables: Table-A and Table-B
  • Table-A has DISTKEY set to column PK
  • Table-B has DISTKEY set to column FK
  • A row is stored in Table-A with PK = 100
  • A row is stored in Table-B with FK = 100

The data for the rows in both tables will be located on the same slice. This is true even without a Foreign Key Constraint defined between the tables.