We have been facing a problem in production lately. We run a DW2500c instance of Azure Synapse Analytics (formerly Azure SQL Data Warehouse). Everything had been running fine until recently, when a team deployed a web app for the business to access data pulled from the DWH. What we noticed is that there was a sudden surge of queries and sessions, which caused a concurrency crunch. I've attached the DWH stats.

From the documentation, I can see the max sessions allowed is 1024. This indicates that it wasn't the session count that was the problem but the number of concurrent queries (300+); the queries probably went into an internal queue waiting for resources to be freed up. I have been researching workload management and was thinking we could give it a try. We still need to check the current workload group assignments and see whether we need to increase concurrency or memory. We tried increasing the tier from DW2000c to DW2500c, but that did not help.

I wanted to get a general opinion: what is the best way to avoid this kind of scenario?
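
For reference, this is the kind of check I had in mind for the query queue and the workload groups — a minimal sketch using the standard dedicated SQL pool DMVs (sys.dm_pdw_exec_requests and sys.workload_management_workload_groups):

```sql
-- Queries currently running vs. queued ('Suspended' = waiting for a concurrency slot)
SELECT [status], COUNT(*) AS query_count
FROM sys.dm_pdw_exec_requests
WHERE [status] IN ('Running', 'Suspended')
GROUP BY [status];

-- Current workload group definitions and their resource allocations
SELECT [name],
       min_percentage_resource,
       cap_percentage_resource,
       request_min_resource_grant_percent
FROM sys.workload_management_workload_groups;
```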

1 Answer

You should take a close look at the resource class the web app is using to access the database, i.e. whether it is a static resource class like staticrc80, which allocates the same amount of memory irrespective of the current DWU, or a dynamic resource class like largerc, which allocates an amount of memory that scales with the DWU. If the web app developers have not specified a resource class explicitly, then it is most likely running as smallrc.
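
Resource classes are implemented as database roles, so one quick way to check what the web app's login is mapped to is something like this (a sketch; the LIKE filter just pattern-matches the standard resource class names and may over-match custom roles):

```sql
-- Resource classes are database roles; list which users belong to which class
SELECT rc.name AS resource_class,
       m.name  AS member_name
FROM sys.database_role_members AS rm
JOIN sys.database_principals AS rc ON rm.role_principal_id  = rc.principal_id
JOIN sys.database_principals AS m  ON rm.member_principal_id = m.principal_id
WHERE rc.name LIKE '%rc%';  -- matches smallrc, mediumrc, largerc, xlargerc, staticrc10..80
```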

Maybe the web app designers thought their app was more important than anything else and assigned themselves a greedy resource class. Either way, the answer will be instructive. You will then need a discussion about capacity planning with the architects responsible for the web app and Synapse, and with the DBAs who look after your Synapse instance.

This problem also should have become apparent in load testing. It is easy to test multiple concurrent users against web apps these days, e.g. with Azure DevOps load testing or Selenium, so ask the web app developers for the results of their load tests.

As alternatives, there are a few things you can try:

  • Try the new result set caching feature in Synapse, which caches query results when enabled. Queries served from the cache do not count against your concurrency limit. This does rely on many similar queries being run, but it may reduce your issue and boost performance (see the sketch after this list).
  • As SQL Data Warehouse, and now Synapse, is not famous for massive concurrency, alternative patterns can be used, e.g. hub and spoke: dump certain tables out into ordinary Azure SQL Databases (which do not have the same concurrency limits), and perhaps even pause your Synapse instance (the hub). Get your web app users to connect to the SQL Database (the spoke).
  • Another new and intriguing feature of Synapse is SQL on-demand. This allows a variation of hub and spoke where you dump tables out to Azure Data Lake using CREATE EXTERNAL TABLE AS SELECT and then point your web app users at the SQL on-demand endpoint instead of the dedicated one (also sketched below). In theory it is just a connection string change for them and would solve your concurrency problem. You can't really perf-tune the queries and the SQL on-demand T-SQL surface area is more limited, but it is certainly an interesting pattern and one I am looking at right now.
  • Another tried and tested alternative is to put Azure Analysis Services (AAS) or Power BI in front of your Synapse database to offload the work.
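
To illustrate the first option: result set caching is enabled per database from master — a minimal sketch, substituting your own database name:

```sql
-- Run against the master database; caches deterministic query results so
-- repeat queries are served from cache without consuming concurrency slots
ALTER DATABASE [YourDwhDatabase]
SET RESULT_SET_CACHING ON;
```

And for the SQL on-demand variant, a rough sketch of the CETAS export. The data source, file format, storage path and table names here are all placeholders, and depending on your storage authentication you may also need a database scoped credential:

```sql
-- One-off setup: file format and target location in the data lake (placeholder names)
CREATE EXTERNAL FILE FORMAT ParquetFormat
WITH (FORMAT_TYPE = PARQUET);

CREATE EXTERNAL DATA SOURCE MyDataLake
WITH (
    TYPE = HADOOP,
    LOCATION = 'abfss://export@yourstorageaccount.dfs.core.windows.net'
);

-- Export a table to the lake; the web app then queries the exported files
-- through the SQL on-demand endpoint instead of the dedicated pool
CREATE EXTERNAL TABLE dbo.DimCustomer_ext
WITH (
    LOCATION = '/dimcustomer/',
    DATA_SOURCE = MyDataLake,
    FILE_FORMAT = ParquetFormat
)
AS
SELECT * FROM dbo.DimCustomer;
```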

HTH