
We have decided to re-architect our product storage with Snowflake to store our time-series data. We currently have 10 TB of time-series data (which grows daily) and 6 jobs that trigger every 15 minutes, processing almost 40 GB of data per day. We are using the Microsoft Azure cloud.

Since we cannot find the exact node/server/cluster sizes of Snowflake, can you please suggest what warehouse size we should choose?

One of the key features of Snowflake is the separation of compute and storage, so the amount of data is not directly relevant to warehouse sizing; what you are doing with the data matters much more. Choosing the correct warehouse size is a matter of trial and error: bigger is faster and proportionately more expensive, so see how long your workload takes on a given size and adjust accordingly. – David Garrison

1 Answer


The warehouse size can be modified at any time, so you can start with a small one and increase the size later until you find the optimum. You can also use multiple warehouses, so you can redistribute your workload.
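For example, a minimal sketch using Snowflake's standard t-shirt sizes (the warehouse name ETL_WH is hypothetical):

```sql
-- Start with a small warehouse (ETL_WH is a hypothetical name).
-- AUTO_SUSPEND / AUTO_RESUME keep costs down between the 15-minute job runs.
CREATE WAREHOUSE IF NOT EXISTS ETL_WH
  WAREHOUSE_SIZE = 'SMALL'
  AUTO_SUSPEND   = 300      -- suspend after 5 minutes idle
  AUTO_RESUME    = TRUE;

-- If the jobs run too slowly, scale up with a single statement:
ALTER WAREHOUSE ETL_WH SET WAREHOUSE_SIZE = 'MEDIUM';
```

Because resizing takes effect immediately (running queries finish on the old size), you can try a size for a few of your 15-minute cycles, compare run times against cost, and adjust.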

Instead of the warehouse size, I suggest you focus on clustering keys, because selecting the correct clustering key for effective data pruning will be very important in your case:

https://docs.snowflake.net/manuals/user-guide/tables-clustering-keys.html#benefits-of-defining-clustering-keys-for-very-large-tables
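As a sketch of defining such a key (the table and column names are hypothetical; time-series tables are typically clustered on their timestamp, often bucketed by day to keep the number of distinct key values low):

```sql
-- Cluster the (hypothetical) readings table by day of its timestamp so that
-- queries filtering on event_time can prune micro-partitions effectively.
ALTER TABLE readings CLUSTER BY (TO_DATE(event_time));

-- Inspect how well the table is clustered on that expression:
SELECT SYSTEM$CLUSTERING_INFORMATION('readings', '(TO_DATE(event_time))');
```

With 10 TB of data, good pruning on the time column will usually cut query time far more than moving up one warehouse size.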