Databricks Spark-Redshift: Sortkeys not working

Question

I am trying to add sort keys from Scala code by following the instructions here: https://github.com/databricks/spark-redshift

df.write
  .format(formatRS)
  .option("url", connString)
  .option("jdbcdriver", jdbcDriverRS)
  .option("dbtable", table)
  .option("tempdir", tempDirRS + table)
  .option("usestagingtable", "true")
  .option("diststyle", "KEY")
  .option("distkey", "id")
  .option("sortkeyspec", "INTERLEAVED SORTKEY (id,timestamp)")
  .mode(mode)
  .save()

The sort keys seem to be applied incorrectly, because when I check the table info I get:

sort key = INTERLEAVEDˇ

I need the right way to add the sort keys.

Obadah Meslmani · Accepted Answer · 2017-05-29T16:29:01

There is nothing wrong with the implementation. The problem is with the "checking query", which returns

sort key = interleavedˇ

which is confusing enough to suggest that something is going wrong.

So if you need to check the interleaved sort keys, run this query:

select tbl as tbl_id, stv_tbl_perm.name as table_name, 
col, interleaved_skew, last_reindex
from svv_interleaved_columns, stv_tbl_perm
where svv_interleaved_columns.tbl = stv_tbl_perm.id
and interleaved_skew is not null;
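
If you would rather run that check from Scala next to the write, here is a minimal sketch over plain JDBC. It is only an illustration, not part of spark-redshift: the object name InterleavedSortKeyCheck and the helper interleavedColumns are made up for this example, and it assumes a Redshift JDBC driver is on the classpath and that the JDBC URL (with credentials, like connString in the question) is passed as the first program argument.

```scala
import java.sql.{Connection, DriverManager}
import scala.collection.mutable.ListBuffer

// Hypothetical helper, not part of spark-redshift.
object InterleavedSortKeyCheck {

  // Same query as above; only the layout differs.
  val checkQuery: String =
    """select tbl as tbl_id, stv_tbl_perm.name as table_name,
      |       col, interleaved_skew, last_reindex
      |from svv_interleaved_columns, stv_tbl_perm
      |where svv_interleaved_columns.tbl = stv_tbl_perm.id
      |and interleaved_skew is not null""".stripMargin

  // Returns (table name, column index) pairs for interleaved sort key columns.
  def interleavedColumns(conn: Connection): List[(String, Int)] = {
    val stmt = conn.createStatement()
    try {
      val rs = stmt.executeQuery(checkQuery)
      val rows = ListBuffer.empty[(String, Int)]
      while (rs.next()) {
        // The name column may come back space-padded, so trim it.
        rows += ((rs.getString("table_name").trim, rs.getInt("col")))
      }
      // The join may return the same pair more than once; keep one of each.
      rows.toList.distinct
    } finally {
      stmt.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // args(0): JDBC URL including credentials (assumption, see above).
    val conn = DriverManager.getConnection(args(0))
    try interleavedColumns(conn).foreach(println)
    finally conn.close()
  }
}
```

If the table from the question shows up in the output with one entry per sort key column, the sortkeyspec option was applied as intended.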
