I have a ultra wide table with 100s of column in redshift. This table is more like a matrix. One row per user with 1 or 0 in each column depending on certain features of the user. the goal is obtain best response time on select SQL from this table. We are expecting about 160 M total rows in the table and the select will return about 30M rows. Select is usually based in one of the columns being 1.
the table looks like user_id, col1, col2.....col100
for example:
select * from table 1 where col10=1;
Then with this result set the UI will figure intersecting users between multiple columns. Currently the performance is very bad. What distkey or sortkey options that will improve the query performance?