1
votes

I am trying to replicate a table in Redshift using a CTAS statement while also specifying a primary key on the new table.

I tried the syntax below, with no luck. However, I was able to specify DISTKEY/SORTKEY using the same syntax:

create table date_dim
PRIMARY KEY(date_key)
--DISTKEY ( date_key )
as
   select date_key,
   calendar_date,.....;

I want to use the primary key as part of the merge logic I am designing in my flow.

TIA!

4 Answers

3
votes

Many people consider primary and foreign keys in Redshift to be an anti-pattern (because they're unenforced), but my team built a small tool (a Python script) that supports this scenario.

You write your SELECT statement in a normal SQL file, define the primary key, foreign keys, distkey, etc. in a YAML configuration file, and then use the script to generate (and optionally execute) the SQL that creates and populates the table.

We also include an Airflow operator to make it simple to schedule and automate this.

The repo is here, and we wrote a bit more about it on our team blog.

1
votes

You can specify DISTKEY and SORTKEY in CTAS, but not a primary key. The Redshift CTAS documentation describes all the options you can specify.
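Since CTAS has no place for a primary key clause, one possible workaround is to declare the table explicitly and then populate it with INSERT INTO ... SELECT. The following is only a minimal sketch: the column types and the source table name staging_dates are assumptions, as the question does not show them.

```sql
-- Declare the table up front so the primary key can be part of the definition.
-- Column types are assumed; adjust them to match the real source columns.
CREATE TABLE date_dim (
    date_key      INTEGER NOT NULL,
    calendar_date DATE,
    PRIMARY KEY (date_key)
)
DISTKEY (date_key)
SORTKEY (date_key);

-- Populate it with the same SELECT that the CTAS would have used.
-- staging_dates is a hypothetical source table.
INSERT INTO date_dim (date_key, calendar_date)
SELECT date_key,
       calendar_date
FROM staging_dates;
```

Keep in mind that Redshift does not enforce the primary key, so any merge logic still has to guarantee uniqueness of date_key itself.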

0
votes

If the column you want to designate as the primary key is already non-nullable, you can use this:

ALTER TABLE <table_name> ADD CONSTRAINT <a_name_for_this_constraint> PRIMARY KEY (<attribute_name>);

For example: ALTER TABLE member ADD CONSTRAINT pk_1 PRIMARY KEY (member_id);
