We are investigating an issue with an API hosted on Azure that connects to Azure Redis Cache (Standard tier, C2). From yesterday evening until early this morning (nearly 12 hours) we saw hundreds of Redis timeouts like this:
Timeout performing GET ????????:FV:Providers:Weather, inst: 1, mgr: Inactive, err: never, queue: 318, qu: 2, qs: 316, qc: 0, wr: 1, wq: 1, in: 65536, ar: 0, clientName: Items, serverEndpoint: ?????????:6380, keyHashSlot: 1586, IOCP: (Busy=1,Free=999,Min=8,Max=1000), WORKER: (Busy=66,Free=32701,Min=300,Max=32767
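In case it helps to compare these counters across log entries, here is a minimal sketch of pulling them out of such a message. The function name `parse_timeout_fields` is purely illustrative and not part of StackExchange.Redis; the sample string is the message above, including the redacted parts and the cut-off ending.

```python
import re

def parse_timeout_fields(message):
    """Extract numeric counters and thread-pool stats from a timeout message."""
    fields = {}
    # Plain numeric counters such as "queue: 318" or "in: 65536".
    for name, value in re.findall(r"\b([A-Za-z]+): (\d+)\b", message):
        fields[name] = int(value)
    # Thread-pool groups such as "WORKER: (Busy=66,Free=32701,...".
    # The closing parenthesis is optional because the logged line may be cut off.
    for pool, body in re.findall(r"\b(IOCP|WORKER): \(([^)]*)", message):
        fields[pool] = {k: int(v) for k, v in re.findall(r"(\w+)=(\d+)", body)}
    return fields

sample = (
    "Timeout performing GET ????????:FV:Providers:Weather, inst: 1, mgr: Inactive, "
    "err: never, queue: 318, qu: 2, qs: 316, qc: 0, wr: 1, wq: 1, in: 65536, ar: 0, "
    "clientName: Items, serverEndpoint: ?????????:6380, keyHashSlot: 1586, "
    "IOCP: (Busy=1,Free=999,Min=8,Max=1000), "
    "WORKER: (Busy=66,Free=32701,Min=300,Max=32767"
)

fields = parse_timeout_fields(sample)
print(fields["queue"], fields["qu"], fields["qs"], fields["qc"])
print(fields["WORKER"]["Busy"], fields["WORKER"]["Min"])
```

For the message above, queue (318) equals qu + qs + qc (2 + 316 + 0), and WORKER Busy (66) is well below Min (300). If we read the fields correctly, that would mean almost all pending commands had already been sent and were waiting for a response, rather than waiting on thread-pool growth, but we are not sure of that interpretation.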
We don't get many visits during the night, but the errors continued until around 9 o'clock this morning. The number of items in the Redis queue reached 7000, even though traffic to our API was very low during the night.
During the day everything was fine, except for an hour this afternoon when we had a peak of visitors and the problem appeared again. We have looked at a lot of metrics: cache read/write operations are at their usual levels, and cache hits, CPU, memory, etc. all look fine.
Other APIs use the same Redis cache instance and they don't suffer from this issue. For this reason we think the Azure Redis instance is sized correctly; otherwise the other APIs would have the same problem.
Looking at the logs, we discovered that just two minutes before the timeout errors started we got more than 200 exceptions like this:
StackExchange.Redis.RedisConnectionException: UnableToResolvePhysicalConnection on GET
   at StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[T](Message message, ResultProcessor`1 processor, ServerEndPoint server)
   at StackExchange.Redis.RedisBase.ExecuteSync[T](Message message, ResultProcessor`1 processor, ServerEndPoint server)
   at StackExchange.Redis.RedisDatabase.StringGet(RedisKey key, CommandFlags flags)
We guess the two errors are related, but we don't know whether we are doing something wrong or whether it was an Azure problem. Could the StackExchange.Redis connection have been corrupted after the UnableToResolvePhysicalConnection exception, so that we have to restart the API to solve the problem?
Any other ideas?
Thanks for your help!