In Redshift database, I want to decide a sort key for a dimension table between surrogatekey and natural primary key. The definition says "Sort keys should be selected based on the most commonly used columns when filtering, ordering or grouping the data". My Question is -
I have a Employee table with (Emp_key,Emp_Id,Emp_name) and this table is joined to Fact table on Emp key. Here "Emp_key" is the surrogate key and "Emp_id" is the natural primary key. And I filter the query on Emp_id but "Emp_key" in the fact table is defined as a "dist key" and read that for a large dimension defining sort & dist keys on the join keys results in better performance and so I want to know which one should i choose between Emp_key and Emp_id for Sort key in a dimension table?
And also, another confusion is choosing sort for the "date" dimension table between "date_key" or ignore defining sort key.
I would appreciate your suggestions in this regard.
Thank you!