0
votes

I have a Snowflake table that has 50 billion rows (~10 TB) and ~40 columns. This table has a timestamp column that's always filled. I need to create a new date column and backfill all values for it.

This query works, but it's really inefficient (it takes 1 hour to scan 1%, with 2XL warehouse). Is there a more efficient way to do this?

UPDATE DAILY_TABLE b
SET b.ts_date = to_date(a.ts_tstamp)
FROM
    (SELECT a.ts_date, a.ts_tstamp 
     FROM DAILY_TABLE a) a;

Current table:

ts_tstamp ts_date
2021-04-28 05:01:32.883 null
2021-01-28 05:01:32.883 null
2020-01-25 05:01:32.883 null

Desired table:

ts_tstamp ts_date
2021-04-28 05:01:32.883 2021-04-28
2021-01-28 05:01:32.883 2021-01-28
2020-01-25 05:01:32.883 2020-01-25
1

1 Answers

0
votes

If the column does not need to be materialized and is always dependent on ts_tstamp then another option could be a virtual column:

ALTER TABLE DAILY_TABLE
ADD COLUMN ts_date_virtual DATE AS TO_DATE(ts_tstamp);