2
votes

I am new to Redhsift and migrting oracle to Redshift.

One of the oracle tables have around 60 indexes. AWS recommends its a good practice to have around 6 compound sort keys.

How would these 60 oracle indexes translate to Redhsift sort keys ? I understand there is no automated conversion or can't have all 60 of them as compound sort keys. I am new to redshift and May I know , how usually this conversion is approached.

In Oracle we can keep adding indexes to the same table and the queries / reports can use them. But in Redshift Changing sortkeys is through recreating the table. How do we make all queries which uses different filter columns and join columns on the same table have best performance?

Thanks

2
please accept my answer below if you think it is correct.Jon Scott

2 Answers

1
votes

Redshift is columnar database, and it doesn't have indexes in the same meaning as in Oracle at all.

You can think of Redshift's compound sort key (not interleaved) as IOT in Oracle (index organized table), with all the data sorted physically by this compound key.

If you create interleaved sort key on x columns, it will act as a separate index on each of x columns in some manner.

In any way, being columnar database, Redshift can outperform Oracle in many aggregation queries due to it's compression and data structure. The main factors that affect performance in Redshift are distribution style and key, sort key and columns encoding.

If you can't fit all your queries with one table structure, you can duplicate the table with different structure, but the same data. This approach is widely used in big data columnar databases (for example projections in Vertica) and helps to achieve performance with storage being the cost.

Please review this page with several useful tips about Redshift performance: https://aws.amazon.com/blogs/big-data/top-10-performance-tuning-techniques-for-amazon-redshift/

0
votes

First a few key points

Redshift <> Oracle

Redshift does not have indexes, Redshift Sort Keys <> Oracle Indexes. Hopefully, you are not expecting Redshift to replace Oracle for your OLTP workload. most of those 60 indexes are likely to be for optimising OLTP type workload.

Max Redshift sortkeys per table = 1

You cannot sort your Redshift data in more than 1 way! the sort key orders your table data. It is not an index. You can specify an interleaved or a compound sort key.

Query Tuning Hopefully, you will be using Redshift for analytical type queries. You should define sort and distribution based upon your expected queries. you should follow the best practice here and the tutorial here

Tuning Redshift is partly an art, you will need to use trial and error!

If you want specific guidance on this, please can you edit your question to be specific on what you are doing?