I am checking the feasibility of moving from Redshift to BigQuery. I need help in implementing the below use case on BigQuery.
We have a by day product performance table which is a date partitioned table. It is called product_performance_by_day
. There is a row for every product that was sold each day. Every day we process the data at the end of the day and put it in the partition for that day. Then we aggregate this by day performance data over the last 30 days and put it in the table called product_performance_last30days
. This aggregation saves querying time and in the case of BigQuery will save the cost as well since it will scan less data.
Here is how we do it in Redshift currently -
We put the aggregated data in a new table e.g. product_performance_last30days_temp
. Then drop the product_performance_last30days
table and rename product_performance_last30days_temp
to product_performance_last30days
. So there is very minimal downtime for product_performance_last30days table.
How can we do the same thing in the BigQuery?
Currently, BigQuery does not support renaming tables or materialized views or table aliases. And since we want to save the aggregated data in the same table every day we cannot use destination table if the table is not empty.