0
votes

Time-to-Time redshift users create persistent temporary tables for their computational purpose. which is good, but such short-term tables gets piled up over a period time and don't convey any meaning to our warehouse. Is there any redshift SQL or redshift way of figuring out when the table was last accessed (select, insert, delete, load, unload). based on the timing we can give warning to the users and drop it.

1

1 Answers

0
votes

The STL_QUERY and SQL_QUERYTEXT tables contain information about queries that have been run on Amazon Redshift.

select query, substring(text,1,60)
from stl_querytext
where pid = pg_backend_pid()
order by query desc;

 query |                         substring
-------+--------------------------------------------------------------
 28262 | select query, substring(text,1,80) from stl_querytext where 
 28252 | select query, substring(path,0,80) as path from stl_unload_l
 28248 | copy category from 's3://dw-tickit/manifest/category/1030_ma
 28247 | Count rows in target table
 28245 | unload ('select * from category') to 's3://dw-tickit/manifes
 28240 | select query, substring(text,1,40) from stl_querytext where 
(6 rows)

You could analyze these queries to determine which tables have been recently queried, and then compare them against SVV_TABLEINFO to determine what tables have not been used recently.

You should also encourage users to use CREATE TEMP TABLE, which creates a table that is automatically dropped at the end of the session. You might be able to use permissions to force certain groups of users to only create TEMP tables, or tables in their own schema.