
I am currently writing a seminar thesis on the Snowflake Cloud Data Warehouse and was wondering how cache consistency is handled within a Virtual Warehouse (VW).

I was unable to find an answer to this question in the original paper (see section 3.2.2 on Local Caching and File Stealing). I also looked for more information in the official documentation, especially Warehouse Considerations:

This cache is dropped when the warehouse is suspended, which may result in slower initial performance for some queries after the warehouse is resumed. As the resumed warehouse runs and processes more queries, the cache is rebuilt, and queries that are able to take advantage of the cache will experience improved performance.

While this explains the lifetime of the local cache, it is still unclear to me how consistency is handled when the VW is not suspended and cached data is updated by another VW. Do the worker nodes holding the local cache check whether the data is up to date for each query? Or is it handled differently?

I am thankful for any additional information on this topic. Thank you.


1 Answer


Snowflake checks the "freshness" of the cached data for every query and re-reads the base data if the cached copy is not current (see: Snowflake Cache).
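
To make the per-query check described above more concrete, here is a minimal sketch in Python. It is only an illustration of the general idea, not Snowflake's actual implementation: the classes `RemoteStorage` and `WorkerCache`, the version counter, and all method names are hypothetical and chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class RemoteStorage:
    """Hypothetical stand-in for the shared storage layer."""
    files: dict = field(default_factory=dict)  # name -> (version, data)

    def write(self, name, data):
        # Every write bumps the version, so stale copies can be detected.
        version = self.files.get(name, (0, None))[0] + 1
        self.files[name] = (version, data)

    def current_version(self, name):
        return self.files[name][0]

    def read(self, name):
        return self.files[name]


@dataclass
class WorkerCache:
    """Hypothetical local cache of one worker node."""
    storage: RemoteStorage
    entries: dict = field(default_factory=dict)  # name -> (version, data)
    remote_reads: int = 0

    def get(self, name):
        cached = self.entries.get(name)
        # Freshness check on every access: compare the cached version
        # with the current version known to the storage layer.
        if cached is not None and cached[0] == self.storage.current_version(name):
            return cached[1]
        # Missing or stale: re-read the base data and refresh the cache.
        self.remote_reads += 1
        version, data = self.storage.read(name)
        self.entries[name] = (version, data)
        return data
```

In this sketch, a write that goes through a different cache (standing in for another VW) bumps the version in `RemoteStorage`, so the next `get` on this worker sees the mismatch and reloads the data instead of serving the stale copy.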