Create table as in Redshift defining primary key

Question

I am trying to replicate a table in Redshift using a CTAS (CREATE TABLE AS) statement while also specifying a primary key on the new table.

I tried the syntax below, but it did not work. However, I was able to specify DISTKEY/SORTKEY using the same syntax.

create table date_dim
PRIMARY KEY(date_key)
--DISTKEY ( date_key )
as
   select date_key,
   calendar_date,.....;

I want to use the primary key as part of the merge logic I am designing in my flow.

TIA!

jbm jbm · Accepted Answer · 2018-12-12T16:32:02

Many people consider primary and foreign keys in Redshift to be an anti-pattern (because they're unenforced), but my team built a small tool (a Python script) that supports this scenario.

You write your select statement in a normal SQL file, define the primary key, foreign keys, distkey, etc. in a YAML configuration file, and then use the script to generate (and optionally execute) SQL to create and populate the table.

We also include an Airflow operator to make it simple to schedule and automate this.

The repo is here, and we wrote a bit more about it on our team blog.
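
If you'd rather not use any tooling, the same "create and populate" pattern can be written by hand. This is only a minimal sketch, not the SQL our script generates: the column types are guesses, `source_date_dim` is a placeholder for whatever table you are replicating, and only the two columns visible in the question are shown. Since CTAS has no place for constraints, the table is declared first and then filled with INSERT ... SELECT.

```sql
-- Column types are assumptions; match them to the real source columns.
CREATE TABLE date_dim (
    date_key      INTEGER NOT NULL,
    calendar_date DATE,
    PRIMARY KEY (date_key)   -- informational only: Redshift does not enforce it
)
DISTKEY (date_key)
SORTKEY (date_key);

-- source_date_dim is a hypothetical name for the table being replicated.
INSERT INTO date_dim (date_key, calendar_date)
SELECT date_key, calendar_date
FROM source_date_dim;
```

Because the key is unenforced, duplicates in `date_key` will load without error, so the merge logic still has to deduplicate on its own.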
