0
votes

I have an RDS instance: db.t2.xlarge (4 vCPU, 16 GiB memory) (max_parallel_workers_per_gather=2)

And another Redshift instance: dc2.large (2 vCPU, 15 GiB memory) (2 nodes)

I have loaded both with same data. I have tried different types of queries, but my RDS always runs queries in shorter time.

Example:

  • I have a table called 'az_product_questions'
  • It has two columns product_id (int) and question_id (int)
  • I inserted it with 1,258,559 rows.

The query I am trying is as follows:

SELECT
  product_id,
  COUNT(*)
FROM az_product_questions
GROUP BY product_id
ORDER BY product_id ASC;
  • In RDS: I didn't add any indexes.
  • In Redshift: product_id is distkey and sortkey

Coming to performance:

  1. On RDS: Avg. 5 seconds
  2. On Redshift: Avg. 6.5 seconds

Am I trying to use Redshift for smaller database, in a wrong use-case, or in a wrong way? Or It's normal?

1

1 Answers

0
votes

A normal database starts to slow down when it has a million rows in a table. Amazon Redshift can handle tables with billions of rows.

For this particular use-case, a normal database would work fine. However, when you add more rows and more columns, a columnar database like Amazon Redshift would show better performance.

So, stick with a database while it performs well for you, then consider using Redshift if you have higher analytic needs. Please note that Redshift is great for reporting, but should not be used with an application that does lots of inserts and updates.