2
votes

i am struggling with my dashboard performance which runs queries on Redshift using JDBC driver.

the query is like -

select <ALIAS_TO_SCHEMA.TABLENAME>.<ANOTHER_COLUMN_NAME> as col_0_0_, 
sum(<ALIAS_TO_SCHEMA.TABLENAME>.devicecount) as col_1_0_ from <table_schema>.<table_name> <ALIAS_TO_SCHEMA.TABLENAME> where <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$1 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$2 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$3 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$4 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$5 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$6 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$7 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$8 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$9 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$10 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$11 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$12 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$13 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$14 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$15 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$16 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$17 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$18 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$19 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$20 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$21 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$22 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$23 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$24 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$25 
 or <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME>=$26 
 or ........

The For dashboard we use Spring, Hibernate ( I am not 100% sure about it though ). But the query might sometimes stretch till $1000 + according to the filters/options being selected on the UI.

But the problem we are seeing is - The First Time this query is being run by the reports, it takes more than 40 sec - 60 seconds for the response. After the first time , the query runs quite fast and takes only few seconds to run.

We initially suspected there must be something wrong with redshift caching , but it turns out that , Even simple queries like these ( But Huge ) takes considerable time to COMPILE, which is clear when we look into the svl_compile table which shows this query was compiled in over 35 seconds.

What should I do to handle such issues ?

1

1 Answers

1
votes

Recommend restructuring the query generated by your dashboard to use an IN list. Redshift should be able to reuse the already compiled query segments for different length IN lists.

Note that IN lists with less than 10 values will still be evaluated as OR. https://docs.aws.amazon.com/redshift/latest/dg/r_in_condition.html#r_in_condition-optimization-for-large-in-lists

SELECT <ALIAS_TO_SCHEMA.TABLENAME>.<ANOTHER_COLUMN_NAME> as col_0_0_
     , SUM(<ALIAS_TO_SCHEMA.TABLENAME>.devicecount) AS col_1_0_
FROM <table_schema>.<table_name> <ALIAS_TO_SCHEMA.TABLENAME>
WHERE <ALIAS_TO_SCHEMA.TABLENAME>.<COLUMN_NAME> IN ( $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11 … $1000 ) 
;