4
votes

On Amazon Redshift, do concurrent queries affect each others performance?

For example, lets say there are two queries: one on a relatively small table (~5m rows) retrieving all rows, and another on a large table (~500m) rows. Both tables have the same fields, both have no compression. Both queries retrieve all data in their respective tables to compute their results. There are no joins or filters. Both queries retrieve about 2-4 fields for their computations.

Running by itself, the small query returns in about 700ms. However, while the large query is running (which by itself takes a few minutes), the small query returns in 4-6 seconds.

This is the observed behavior on a cluster with a single XL node.

Is this the expected behavior? Is there a configuration setting that will promise performance consistency of the small query, even if the large query is running?

2

2 Answers

4
votes

Copy-pasted from: https://forums.aws.amazon.com/thread.jspa?threadID=137540#

I've performed some concurrent query benchmarking.

I created a straightforward query which by itself took about a minute to run. I then ran one of those queries at once, then two, them three, etc, and timed each query.

Each query basically halved database performance - e.g. what you'd expect; double the load, halve the performance.

Actually, it's a bit better than halving - you get about an extra 10% performance.

This performance behaviour held true up to 5 concurrent queries, which is the max number of concurrent queries configured on the database I was working with. If I ran six queries, the final query could not execute until one of the first queries had finished and freed up a slot.

Finally, vacuum acts much like a normal query - it halves performance. It's not special.

Actually, vacuum is something more than halving - it's equivelent to a pretty heavy query.

2
votes

There are no guarantees because all of this is running on a fixed number of CPUs. With a fixed capacity of work when you increase the work it lowers the throughput. The short answer is get a bigger machine (ie more nodes).

Here is the specifics of your answer:

https://forums.aws.amazon.com/message.jspa?messageID=437015#

http://docs.aws.amazon.com/redshift/latest/dg/c_workload_mngmt_classification.html