0
votes

I have a scenario where i have to run parallel inserts/deletes on a snowflake table. For example: the table contains data related to different countries. And each insert pipe or thread will be contain data for only a specific country. Similarly when i am running parallel deletes then each delete thread will be deleting data for only a specific country. I was looking to partition the data in the snowflake table based on country which might have helped in avoiding any locks. however, it seems that option is not there in snowflake.

Can you suggest how can i achieve parallel inserts/deletes and avoid and contention or locks.

Note: I am using matillion to run different ELT jobs in parallel to do the inserts.

2
Have a look here for some ideas.Sergiu

2 Answers

0
votes

In snowflake, there is no option of partition. It's a true SaaS product with almost zero administration.

Coming to your question of the parallel insert and delete, Incase there is a huge delay, either you can scale up or make auto-scale. the Snowflake Data Platform implements a powerful and unique form of partitioning, called micro-partitioning, that delivers all the advantages of static partitioning without the known limitations, as well as providing additional significant benefits.

You can also go for table clustering

0
votes

It is highly unlikely (almost impossible) that you will get locking on your data tables. If you are getting locking on the metadata tables then follow the solution suggested in @sergiu's link above.

If you have performance issues (but not table/record locking) then possible solutions include:

  • Larger warehouse (unlikely to improve performance much but worth trying if all else fails)
  • Auto-scale your warehouse (as suggested in the previous answer) so that more warehouses are running in parallel when there is a high workload
  • Run multiple warehouses: 1 per country or 1 per group of countries. No real benefit over auto-scaling unless there is significantly more data for some countries compared to others and there is a benefit to sizing the warehouses to match the data size