
https://docs.snowflake.com/en/user-guide/data-load-considerations-prepare.html#general-file-sizing-recommendations The number of load operations that run in parallel cannot exceed the number of data files to be loaded. To optimize the number of parallel operations for a load, we recommend aiming to produce data files roughly 100-250 MB (or larger) in size compressed.

I got the above details from the Snowflake docs. They simply say "(or larger)" — can someone explain what the maximum recommended size is?


1 Answer


It's a trade-off between aggregating smaller files (reducing per-file overhead) and splitting larger files into smaller ones (distributing the workload and increasing parallelism).

The general size recommendation that balances this trade-off is 100-250 MB compressed. That is what's in the docs. The phrase "or larger" just means that the best file size in your individual scenario can also be above 250 MB, e.g. 300 MB, depending on how that trade-off works out for your data and warehouse.
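To illustrate the splitting side of that trade-off, here is a minimal sketch using GNU coreutils. It assumes a hypothetical export file named `big_table.csv`; `split -C` cuts on line boundaries so no record is broken across chunks, and the 100-250 MB guideline applies to the compressed size:

```shell
# Sketch only: "big_table.csv" is an assumed file name standing in for a
# large CSV export you want to stage for a parallel Snowflake load.
seq 1 100000 > big_table.csv           # stand-in for a large export

# Split into line-aligned chunks of at most ~200 MB each.
# -C keeps whole lines together; -d gives numeric suffixes (chunk_00, chunk_01, ...).
split -C 200m -d big_table.csv chunk_

# Compress the chunks; Snowflake's 100-250 MB target refers to compressed size.
gzip chunk_*

ls chunk_*
```

The compressed `chunk_*.gz` files can then be uploaded to a stage (e.g. via `PUT`) and loaded with a single `COPY INTO`, letting the warehouse process the chunks in parallel.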