0
votes

I have two queries running on Amazon RedShift database. Both the queries are exactly same except the tables that they are referring to. The tables are different in terms of their distribution styles and sort keys.

It is a simple join operation.

Let me call the two queries as Q1 and Q2. When I view the query plan using EXPLAIN, their plans are similar except following two differences.

Q1: DS_BCAST_INNER and cost is also estimated to be way higher. Q2: DS_DIST_NONE and cost is much smaller than Q1

However, to my surprise, execution time for Q2 is coming out to be higher than Q1. Any ideas??

1
0 down vote In Redshift, DIST_KEY and SORT KEYS make a lot of difference in queries. Unless you give more details ( pasting explain plans), it would be difficult for anyone to give a meaningful answerRakesh Singh
Thank you, Rakesh for responding. Well, I figured out the issue. Like you said,it was also a question of sort keys (to make the processing of 'where' and 'group by' clauses faster).Dippy Aggarwal

1 Answers

1
votes

Taken from Redshift documentation (http://docs.aws.amazon.com/redshift/latest/dg/c-the-query-plan.html), the cost is:

The relative cost of the operation. Cost is a measure that compares the relative execution times of the steps within a plan. Cost does not provide any precise information about actual execution times or memory consumption, nor does it provide a meaningful comparison between execution plans.

So basically you can't compare two queries based on the cost that shows up on the EXPLAIN plan.