3
votes

Due to a query that had been running for more than 100 hours, we saw in Aginity that our cluster size went from 1 TB to 5 TB.

By inspecting svv_table_info, we saw that each table was far bigger than what we used to see in the past. We then checked the AWS console and saw that the size increase had started 5 days earlier, at the same time the 100-hour query started.
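The check was roughly the following query against svv_table_info (note that the size column counts 1 MB blocks per table, so it is an approximation of the real footprint):

```sql
-- Approximate size (in MB of 1 MB blocks) of each table, largest first.
SELECT "table",
       size     AS size_mb,
       pct_used,
       tbl_rows
FROM   svv_table_info
ORDER  BY size DESC;
```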

After killing the query, a few minutes later the Redshift size went back to 1 TB and each table's size went back to normal.

Why is this happening?

Just for the record, the 100-hour query did not involve all of the tables whose size increased dramatically while the query was running.


EDITED

I cannot now really reproduce the error. But the steps were the following:

  • In Aginity, I accidentally noticed that the size of the cluster was 5 TB, even though the cluster has only 2 × ds2.xlarge nodes (4 TB total)

  • I queried svv_table_info to get the size of each of the tables - they summed up to 5 TB, and most of them looked implausibly big

  • I saw that the DWH had all up-to-date data, even though it was "reportedly" full for at least 2 days (it also went over 4 TB in size)

  • I found a query that had been running for 100+ hours - one of our data analysts had left a notebook open. The query did not involve all of the tables that seemed unreasonably big

  • I killed the query, and after a few moments everything went back to normal
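The kill itself was nothing special - something along these lines, using the stv_recents system view to find the currently running statement (the pid below is a placeholder):

```sql
-- Find the longest-running statements; duration is in microseconds.
SELECT pid, duration, user_name, query
FROM   stv_recents
WHERE  status = 'Running'
ORDER  BY duration DESC;

-- Terminate the offending session by its pid (12345 is a placeholder).
SELECT pg_terminate_backend(12345);
```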

So: how could Redshift grow to 5 TB if we only have 2 × 2 TB = 4 TB of space available?

2
When the same thing happened to me, I had a bug in my query that produced a Cartesian product of a big table with itself, so n² rows... and this spills to disk. Check your join conditions carefully - AlexYes
You're assuming that the table size in svv_table_info reflects the actual size on disk, and that's not always true. It may appear that the total table size is > 4 TB, but this is due to the way svv_table_info calculates table size, which is approximate. You only have 4 TB of disk regardless of what svv_table_info is telling you. The cluster size did not "go from 1 TB to 5 TB" - the cluster size is always 4 TB (in this case). What you are looking at is the percentage of used disk space: initially it was about 25%, and with the big query it went up to 100%. - Nathan Griffiths

2 Answers

0
votes

It happened to us as well. Redshift uses disk space while running a query, which is why the cluster size goes back to normal when you kill the query.

Here's a really good article about that: https://www.periscopedata.com/blog/disk-based-temporary-tables

0
votes

It may be helpful to distinguish first how Amazon Redshift uses storage during query execution. There are two ways:

  1. Disk-based Queries. When a query runs out of memory, the overflow “spills” to the disk and the query goes “disk-based”.
  2. Intermediate Storage. When a query needs to save the results of an intermediate operation, to use as input for a future operation.
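To tell the two apart in practice, you can check whether any step of a suspect query spilled to disk. A sketch against the svl_query_summary system view (the query id is a placeholder):

```sql
-- Steps of a given query that went disk-based; any row with
-- is_diskbased = 't' means that step overflowed its memory allocation.
SELECT query, step, rows, workmem, label, is_diskbased
FROM   svl_query_summary
WHERE  query = 12345          -- placeholder query id
  AND  is_diskbased = 't';
```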

In this case, I think you're looking at use of intermediate storage. Whatever the query calculated, it started filling up the disk with intermediate results. This happens a lot when a query joins two very large tables (e.g. each with billions of rows), typically written by somebody inexperienced in writing OLAP queries. The absolute number of 5 TB is less relevant than the percentage of disk space used, which was 100% in your case.
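You can watch that percentage directly rather than relying on svv_table_info estimates. A sketch against the stv_partitions system view, where capacity and used are counted in 1 MB blocks:

```sql
-- Real disk capacity and usage per node; this reflects the actual
-- 4 TB of disk, regardless of what svv_table_info reports.
SELECT owner AS node,
       SUM(capacity) / 1024 AS capacity_gb,
       SUM(used)     / 1024 AS used_gb,
       ROUND(SUM(used)::numeric * 100 / SUM(capacity), 1) AS pct_used
FROM   stv_partitions
GROUP  BY owner
ORDER  BY owner;
```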

We've written a post about how to fix disk-based queries that goes into the details of Redshift here: https://www.intermix.io/blog/how-to-fix-disk-based-queries-amazon-redshift/