Our data warehouse is in Redshift (about 50 TB). Business users sometimes run big queries (many joins, inline subqueries generated by BI tools such as Tableau), and these queries slow the whole cluster down.
Is it wise to use Spark on top of Redshift to offload some of the computation outside Redshift?
Or would it be easier and more cost-effective to increase Redshift's computing power by adding more nodes?
If I execute
select a.col1, b.col2 from table1 a, table2 b where a.key = b.key
in Spark, where both tables reside on Redshift and are accessed via JDBC, where does the actual processing happen (in Spark or in Redshift)?
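To make the scenario concrete, here is a minimal sketch of how such a read would typically be set up with Spark's generic JDBC source (the cluster endpoint and credentials are placeholders, not real values). The key distinction is in the `dbtable` option: naming a bare table makes Spark scan it and do the join itself, while wrapping the join in a derived table pushes it down so Redshift executes it and Spark only fetches the result.

```python
# Hypothetical JDBC options for reading from a Redshift cluster.
# "url", "user", "password", and "dbtable" are standard Spark
# DataFrameReader JDBC options; the endpoint and credentials below
# are placeholders.
jdbc_options = {
    "url": "jdbc:redshift://my-cluster.example.us-east-1.redshift.amazonaws.com:5439/dev",
    "user": "bi_user",      # placeholder
    "password": "...",      # placeholder
    # Derived-table form: Redshift runs the join, Spark receives only
    # the joined rows. Using "dbtable": "table1" instead would make
    # Spark pull the raw rows and perform the join on its own executors.
    "dbtable": "(select a.col1, b.col2 "
               "from table1 a join table2 b on a.key = b.key) q",
}

# Inside a Spark session this would be loaded as:
#   df = spark.read.format("jdbc").options(**jdbc_options).load()
```

With the plain two-table form from the question, the generic JDBC source only pushes down column pruning and simple filters, so the join itself runs in Spark after both tables are transferred over the network.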