Our solution uses Azure IoT Hub to connect thousands of devices to our backend. Devices report their operating state, and based on this state we control the devices. The challenge is that devices are installed in groups, and a state change on one device affects the other devices in its group. Because of this, we need to handle the messages for a group in sequence. I would have preferred to use the device's groupId to partition messages, but IoT Hub and its default endpoint use deviceId as the partition key, so we've had to find other means of synchronising messages across partitions. Up until now we've used a semaphore for this, which has worked fine as we're running everything in one process.
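To make the current setup concrete, here is a minimal Python sketch of this kind of in-process synchronisation. It is an illustration rather than our actual code: it uses one lock per group instead of a single semaphore, and the names `GroupSerializer`, `handle` and `demo`, as well as the message shape, are hypothetical.

```python
import asyncio
from collections import defaultdict


class GroupSerializer:
    """Processes messages one at a time per group (hypothetical helper)."""

    def __init__(self):
        # One lock per groupId, created lazily on first use.
        self._locks = defaultdict(asyncio.Lock)
        self.processed = []

    async def handle(self, group_id, device_id, state):
        # Messages for the same group wait for each other;
        # messages for different groups can interleave.
        async with self._locks[group_id]:
            await asyncio.sleep(0)  # stand-in for the real control logic
            self.processed.append((group_id, device_id, state))


async def demo():
    serializer = GroupSerializer()
    messages = [
        ("g1", "dev-a", "on"),
        ("g2", "dev-c", "off"),
        ("g1", "dev-b", "off"),
    ]
    await asyncio.gather(*(serializer.handle(*m) for m in messages))
    return serializer.processed


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

This only works while every message for a group passes through the same process, which is exactly the limitation described next.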
As we're nearing the point where a single App Service plan can no longer handle all messages, we need to scale the solution out. Thus, the semaphore will no longer suffice, and we need to find an alternative to distribute and synchronise messages.
The plan is to use custom routing in IoT Hub, forwarding messages to one or more Event Hubs endpoints. Currently I see two options:
1. If it's possible to set the partition key when using custom routes/endpoints, we could assign each device a groupId and use that to partition messages. This would route all messages for a group to the same processor without any additional synchronisation, allowing us to simply scale out the event processor to handle more messages. Sadly, I've not found a way to set the partition key of messages when using custom routes/endpoints, so this does not look like a viable solution.
2. Add multiple custom Event Hubs endpoints to IoT Hub, and use groupId to route messages to endpoints. This would require us to deploy multiple event processor instances, each configured to consume messages from "its" event hub. Since event hubs have a minimum of 2 partitions, we would still need a semaphore to synchronise messages within each event processor. This seems like the only viable option, but it adds quite a bit of complexity to scaling, as we would have to manually deploy and configure each processor instance instead of simply scaling out the App Service plan and using partitions to distribute messages.
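Both options rely on the same idea: a groupId must always map to the same destination. A minimal Python sketch of such a mapping follows, assuming a stable hash is acceptable. The endpoint names and the functions `endpoint_index` and `endpoint_for` are hypothetical, and how the resulting value would be made available to an IoT Hub route is not covered here.

```python
import hashlib

# Hypothetical endpoint names, one per event processor instance.
ENDPOINTS = ["group-events-0", "group-events-1", "group-events-2"]


def endpoint_index(group_id: str, endpoint_count: int) -> int:
    """Map a groupId to a stable index in the range [0, endpoint_count)."""
    # hashlib gives the same result in every process, unlike Python's
    # built-in hash(), which is randomised per process for strings.
    digest = hashlib.sha256(group_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % endpoint_count


def endpoint_for(group_id: str) -> str:
    """Return the endpoint that should receive all messages for a group."""
    return ENDPOINTS[endpoint_index(group_id, len(ENDPOINTS))]


if __name__ == "__main__":
    for gid in ("group-17", "group-42", "group-17"):
        print(gid, "->", endpoint_for(gid))
```

One drawback of a plain modulo mapping is that changing the number of endpoints remaps most groups, which adds to the scaling complexity mentioned in option 2.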
Is there a way to change the partition key when using custom routes/endpoints, which would allow us to implement option 1, or are there other, better ways to achieve this?