Our goal is to completely eliminate Azure Search service throttling. We started with 3 replicas and 1 partition on the S1 tier and saw a lot of throttling; at times up to 1.5% of requests were throttled. We took the following measures to alleviate the problem:
1) We started load testing the service and came up with a baseline req/sec for 3 replicas. Every time we hit ~37 req/sec, the service would start throttling.
2) We did not want our users to see errors, so we implemented an exponential backoff transient fault policy that retries the call when the Azure Search API returns a 5xx or 408 (Request Timeout) response. That worked well for us.
3) The problem still remained: we still get throttled at 37 req/sec, which seems very low to us. This means we are getting a max of roughly 12 req/sec per replica. So we performance-tuned our queries (removed facets and a high-cardinality field from our index, cleaned up our field properties, and made sure the index is doing the bare minimum). Our queries got a little faster, but this had little effect on throttling.
4) So we decided to go up to five replicas to get rid of throttling. We load tested again, and now the service can handle a baseline of ~59 req/sec. This again works out to ~12 req/sec per replica.
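To make step 2 concrete, here is a minimal sketch of an exponential backoff retry policy, written in Python purely for illustration. The names `is_retryable` and `call_with_backoff`, the `send` callable, and the attempt count and delay values are all assumptions for this sketch and are not taken from the actual implementation.

```python
import random
import time


def is_retryable(status):
    """Treat 408 (Request Timeout) and any 5xx response as transient."""
    return status == 408 or 500 <= status <= 599


def call_with_backoff(send, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Call send() until it succeeds or max_attempts is reached.

    send is a hypothetical callable that performs one search request and
    returns a (status_code, body) tuple. The delay doubles after each
    failed attempt (0.5s, 1s, 2s, ...) and is capped at max_delay.
    """
    for attempt in range(max_attempts):
        status, body = send()
        # Return on success, on a non-transient error, or on the last attempt.
        if not is_retryable(status) or attempt == max_attempts - 1:
            return status, body
        delay = min(max_delay, base_delay * 2 ** attempt)
        # Add up to 50% random jitter so clients do not retry in lockstep.
        sleep(delay + random.uniform(0, delay / 2))
```

A policy like this hides throttling from users but does not raise the ceiling: retried calls still count against the same req/sec limit.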
~12 req/sec per replica seems like low capacity for a Standard tier service. This is a huge problem for us, as our traffic is only going to increase (not to mention dealing with nasty bot traffic).
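The per-replica figures come from dividing each load-test baseline by the replica count. A short Python sketch of that arithmetic follows. The helper names and the 100 req/sec target are hypothetical, and the replica estimate assumes throughput keeps scaling linearly with replicas, which two data points suggest but do not prove.

```python
import math


def per_replica_throughput(total_rps, replicas):
    """Average req/sec each replica handled before throttling started."""
    return total_rps / replicas


def replicas_needed(target_rps, rps_per_replica):
    """Replicas required for target_rps, assuming linear scaling."""
    return math.ceil(target_rps / rps_per_replica)


# Measured baselines from the load tests: replica count -> req/sec.
measurements = {3: 37, 5: 59}

for replicas, rps in measurements.items():
    rate = per_replica_throughput(rps, replicas)
    print(f"{replicas} replicas: {rate:.1f} req/sec per replica")
# prints:
# 3 replicas: 12.3 req/sec per replica
# 5 replicas: 11.8 req/sec per replica

# Hypothetical target of 100 req/sec at the lower measured rate.
print(replicas_needed(100, per_replica_throughput(59, 5)))  # prints 9
```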
Do these benchmark numbers look right to the Azure Search Team?
Or are we doing something wrong? I can provide the search query if needed.
Any help would be much appreciated!
Thanks!