I'm trying to join two tables in redshift. One big and one small. The join is by their id, and I've distributed the big one on the cluster by the column (used for the join) and I used as sortkey this column too. The small one table I have distributed whole on all nodes and used sortkey with the column used for the join.
Example:
create table table_small diststyle all SORTKEY(id) as select * from another_small_table;
create table big_table distkey (id) diststyle key SORTKEY(id) as SELECT * from another_big_table;
explain SELECT * FROM big_table big JOIN small_table small ON big.id = small.id;
The query plan say that redshift is doing a Hash join instead of merge join. This is the expected behaviour? I expected merge join.