2
votes

I am working on a proof of concept (POC) for Azure Event Hubs, with the aim of adopting it in our application.

A quick brief on the flow:

  • I created a tool that reads CSV data from a local folder and sends it to the event hub.
  • The tool sends the event data to the event hub in batches.
  • With 12 instances of the tool running in parallel, I can send a total of 600,000 lines (messages) to the event hub within 1 minute.
  • On the receiver side, however, it takes more than 10 minutes to receive the same 600,000 lines.
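To put numbers on the gap, here is a rough back-of-the-envelope calculation as a small Python sketch. The figures come from the list above; the 10-minute receive time is only a lower bound ("more than 10 mins"), so the real receive rate is at or below what is computed here. The function and variable names are just for illustration.

```python
# Figures from the description above; RECEIVE_SECONDS is a lower bound.
TOTAL_EVENTS = 600_000
SEND_SECONDS = 60            # 12 parallel senders finish within 1 minute
RECEIVE_SECONDS = 10 * 60    # receiver needs more than 10 minutes
PARTITIONS = 32

def events_per_second(total_events, seconds):
    """Average throughput over the whole run."""
    return total_events / seconds

send_rate = events_per_second(TOTAL_EVENTS, SEND_SECONDS)
receive_rate = events_per_second(TOTAL_EVENTS, RECEIVE_SECONDS)

# How much faster the receiver has to get just to keep up with the senders.
required_speedup = send_rate / receive_rate

# Per-partition receive rate needed to match the senders, assuming the
# events are spread evenly over all partitions.
per_partition_rate = send_rate / PARTITIONS

print(f"send rate:          {send_rate:.0f} events/s")
print(f"receive rate:       {receive_rate:.0f} events/s (at best)")
print(f"required speedup:   {required_speedup:.0f}x")
print(f"per-partition rate: {per_partition_rate:.1f} events/s")
```

So the receiver has to become at least 10 times faster just to match the senders, which works out to a little over 300 events per second per partition if the load is spread evenly.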

What I need to achieve

  • I would like the egress (processing) speed on the receiver to match, or ideally double, the ingress speed.

Existing configuration

The configuration I am using is:

  • 10 throughput units (TU); one event hub with 32 partitions.

  • The code follows the same logic as described on MSDN.

  • The only difference is that I am sending the lines of data in a batch.
    EventProcessorHost with options { MaxBatchSize = 1000000,
    PrefetchCount = 1000000 }

1 Answer

2
votes

To achieve a higher egress rate (that is, a faster processing pipeline) in Event Hubs:

  1. Create a scaled-out pipeline: each partition in an event hub is the unit of scale for processing events out of it. At the scale you describe (600,000 events per minute, which is 10,000 events per second, across 32 partitions) you already have this right. Make sure you create as many partitions as you expect your pipeline to need in the near future. Think of analyzing traffic on a highway, where the number of lanes is the only limit on the amount of traffic.

  2. Distribute load equally across partitions: if you are using SendToASpecificPartition or SendUsingPartitionKey, you need to take care of equal load distribution yourself. If you use EventHubClient.Send(EventDataWithOutPartitionKey), the Event Hubs service makes sure all of your partitions are equally loaded. If a single partition is heavily loaded, the time it takes to process all events in the event hub is bound by the number of events on that partition.

  3. Scale out physical resources on the receiver/EventProcessorHost: most importantly network (sockets and bandwidth) and, after a point, CPU and memory. Use PartitionManagerOptions.MaxReceiveClients to increase the maximum number of EventHubClients created per EventProcessorHost instance (each client has a dedicated MessagingFactory, which maps to one socket). The default is 16.
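To illustrate point 2, here is a minimal Python sketch of why one hot partition dominates the total processing time. It is a toy model, not Event Hubs code: the function names are made up, every partition is assumed to be read in parallel at the same fixed rate, and the rate of 500 events per second per partition is a hypothetical number chosen only to make the comparison concrete.

```python
def round_robin_loads(total_events, partitions):
    """Spread events as evenly as possible across partitions."""
    base, extra = divmod(total_events, partitions)
    return [base + (1 if i < extra else 0) for i in range(partitions)]

def skewed_loads(total_events, partitions, hot_share):
    """Put hot_share of all events on partition 0, spread the rest evenly."""
    hot = int(total_events * hot_share)
    return [hot] + round_robin_loads(total_events - hot, partitions - 1)

def drain_seconds(partition_loads, per_partition_rate):
    """All partitions are read in parallel, so the fullest one sets the time."""
    return max(partition_loads) / per_partition_rate

TOTAL_EVENTS = 600_000   # from the question
PARTITIONS = 32          # from the question
RATE = 500               # hypothetical events/s per partition reader

even = round_robin_loads(TOTAL_EVENTS, PARTITIONS)
skewed = skewed_loads(TOTAL_EVENTS, PARTITIONS, hot_share=0.5)

even_time = drain_seconds(even, RATE)
skewed_time = drain_seconds(skewed, RATE)

print(f"even load:   {even_time:.1f} s")
print(f"skewed load: {skewed_time:.1f} s")
```

Under these assumptions the evenly loaded hub drains in 37.5 seconds, while the hub with half of the events on one partition needs 600 seconds, even though 31 of the 32 readers finish early and sit idle.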

Let me know how it went... :)

More on Event Hubs.