3
votes

I just ran a small stress test on Azure Notification Hubs.

I sent 200 identical messages to an iPhone. 62 of them came back with "The Push Notification System returned an Internal Server Error"

The other 138 came back with "The Notification was successfully sent to the Push Notification System"

So the failure rate is 31%!!!

I turned on 'enableTestSend' mode, and the messages above are read from NotificationOutcome->RegistrationResult->Outcome
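For reference, the tally above can be reproduced with a few lines of Python. This is only a minimal sketch, assuming the Outcome strings have already been collected into a plain list; `failure_rate` is a hypothetical helper and not part of any Notification Hubs SDK.

```python
from collections import Counter

# Outcome strings exactly as reported in the test above.
FAILURE = "The Push Notification System returned an Internal Server Error"
SUCCESS = "The Notification was successfully sent to the Push Notification System"


def failure_rate(outcomes):
    """Return (counts per outcome, fraction of outcomes that are not SUCCESS)."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no outcomes to summarize")
    failed = total - counts[SUCCESS]
    return counts, failed / total


if __name__ == "__main__":
    # Rebuild the reported run: 62 failures and 138 successes.
    outcomes = [FAILURE] * 62 + [SUCCESS] * 138
    counts, rate = failure_rate(outcomes)
    for text, n in counts.most_common():
        print(n, text)
    print(f"failure rate: {rate:.0%}")
```

Anything that is not the success string is counted as a failure, so other error messages would also show up in the rate.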

Has anyone else run tests like this?

This is definitely not acceptable.

1
The Internal Server Error could be caused by a range of backend problems that may be outside of Azure's control (for example: APNS may be having problems). If you continue to see this problem then Microsoft's advice is to report the problem because you can't debug these issues yourself. - Simon W
I don't care about Windows Phone at all. I just did another test run without 'enableTestSend' and got the same failure rate: ~30% were not received by my iPhone... I'm trying Amazon SNS and other services... - Sheep
Just finished a test using Amazon SNS: a 100% delivery rate out of 200 messages. I tried Azure again and it still misses around 30%. It might not be a fair comparison, since with Amazon SNS I publish to an endpoint for a single device while Azure only supports tags. But that's what I need. - Sheep
The test send operation itself is throttled by Notification Hubs; the limit is 100 per minute per namespace, but that should produce a different error. Also, the PNS (APNS, ADM, etc.) 'does not like' a high send rate to the SAME device, but again I would expect a different error. Could you provide the namespace name and some tracking IDs for the failed messages, so I can take a look at the logs? In general, NH send capacity is much higher than 200 messages in a row :) - efimovandr

1 Answer

0
votes

Depending on how your load test was designed, this may or may not have been a Notification Hubs-related failure. As @efimovandr noted, the sends could have been throttled by APNS, or, as @simon-w suggested, it could be some other PNS-specific issue. One way to verify this is to run exactly the same test that you ran through NH, but call the PNS directly instead. Chances are you'll get the same success rate. If so, you need to design the test differently so that it better reflects real-world service usage.
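One way to make such a test less bursty is to space the sends out. Below is a rough Python sketch, not a definitive implementation: `paced_send` and the `send` callback are hypothetical names, not part of any SDK, and the default of 100 per minute is simply the test-send limit mentioned in the comments, used here as an assumed pacing target.

```python
import time


def paced_send(send, count, per_minute=100, clock=time.monotonic, sleep=time.sleep):
    """Call send(i) `count` times, starting calls at least 60/per_minute seconds apart.

    `send` is a caller-supplied function (hypothetical here) that performs one
    notification send and returns its outcome. Returns the list of outcomes.
    """
    if per_minute <= 0:
        raise ValueError("per_minute must be positive")
    interval = 60.0 / per_minute
    results = []
    next_slot = clock()  # earliest time the next send may start
    for i in range(count):
        delay = next_slot - clock()
        if delay > 0:
            sleep(delay)
        started = clock()
        results.append(send(i))
        # Measure the gap from the start of this send, so a slow send
        # is not followed by a burst of catch-up sends.
        next_slot = started + interval
    return results


if __name__ == "__main__":
    # Stand-in for a real send call; replace with your own client code.
    outcomes = paced_send(lambda i: f"sent message {i}", count=3, per_minute=100)
    print(outcomes)
```

The same `send` callback can be pointed first at the hub and then directly at the PNS, so both runs use identical pacing and the success rates are comparable.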

Microsoft offers an SLA on the Notification Hubs service, which means that they invest in keeping failure rates manageable for customers.

If you are still experiencing the problem, contact customer support with your namespace name and the approximate time (with time zone) when it happened, and they will help you understand what was going on.