
I am trying to send a huge amount of data to an event hub. I create batches of EventData using EventDataBatch and send them to the event hub. Initially I sent the batches through a single EventHubClient. Later I created a pool: var eventHubClientPool = new EventHubClient[MaxConnections];
And now I send each batch as eventHubClientPool[connectionId].SendAsync(ehBatch.ToEnumerable()); where connectionId = random.Next(MaxConnections);
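For context, the pooled-send pattern described above looks roughly like this (a sketch using the Microsoft.Azure.EventHubs SDK; the connection string, MaxConnections value, and class name are placeholders, not code from the question):

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.EventHubs;

class PooledSender
{
    // Assumed pool size; tune to your workload.
    const int MaxConnections = 4;
    static readonly Random random = new Random();
    static readonly EventHubClient[] eventHubClientPool = new EventHubClient[MaxConnections];

    static void InitPool(string connectionString)
    {
        for (var i = 0; i < MaxConnections; i++)
            eventHubClientPool[i] = EventHubClient.CreateFromConnectionString(connectionString);
    }

    static Task SendBatchAsync(EventDataBatch ehBatch)
    {
        // Pick a client at random so sends spread across the pooled connections.
        var connectionId = random.Next(MaxConnections);
        return eventHubClientPool[connectionId].SendAsync(ehBatch.ToEnumerable());
    }
}
```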

How do I further increase the throughput?


2 Answers


According to the "Best practices" docs:

"A single batch must not exceed the 1 MB limit of an event. Additionally, each message in the batch uses the same publisher identity. It is the responsibility of the sender to ensure that the batch does not exceed the maximum event size. If it does, a client Send error is generated"

If you still want to improve throughput,

  • Consider minifying/compressing the payload.

  • Store the payload in an external store such as Cosmos DB and send a reference to it in the event.
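On the 1 MB limit quoted above: rather than tracking sizes manually, you can let the SDK enforce the limit for you. A sketch (Microsoft.Azure.EventHubs; client and the payloads collection are assumed to exist already):

```csharp
// Build batches that respect the size limit by checking TryAdd, which
// returns false when adding the event would exceed the batch's capacity.
var batch = client.CreateBatch();
foreach (byte[] payload in payloads)
{
    var eventData = new EventData(payload);
    if (!batch.TryAdd(eventData))
    {
        // Batch is full: send it and start a new one with the current event.
        await client.SendAsync(batch);
        batch = client.CreateBatch();
        batch.TryAdd(eventData);
    }
}
if (batch.Count > 0)
    await client.SendAsync(batch);
```

This avoids the client Send error mentioned in the docs, since no batch ever grows past the maximum event size.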


You'll have to increase the throughput units (TUs). One TU allows:

  • Ingress (send events to event hubs): up to 1000 events/sec or 1 MB/sec, whichever comes first
  • Egress (consume events from event hubs): up to 4096 events/sec or 2 MB/sec

How to change throughput units:

  • As Sajeetharan mentioned, you could do that by setting up auto-inflate. Note this feature is only available on the Standard tier.
    • Go to Azure Portal Event Hubs namespace Overview tab, then find auto-inflate throughput units
  • You could also set the TUs during namespace creation, up to 20 for both the Basic and Standard tiers.
  • If you've already created the namespace, you could change the TU by going to the Azure Portal Event Hubs namespace -> Settings -> Scale
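Besides the portal steps above, the same change can be scripted. A sketch with the Azure CLI (resource group and namespace names are placeholders; auto-inflate requires the Standard tier):

```shell
# Scale an existing namespace to 10 TUs and let auto-inflate grow it up to 20.
az eventhubs namespace update \
    --resource-group myResourceGroup \
    --name myEventHubsNamespace \
    --capacity 10 \
    --enable-auto-inflate true \
    --maximum-throughput-units 20
```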

TUs are purchased at the namespace level and are shared across all event hubs in that namespace.

There's also Event Hubs Dedicated cluster which is designed for customers with the most demanding requirements. This offering builds a capacity-based cluster that's not bounded by TUs, but by CPU and memory usage of the cluster.