41
votes

I've been learning about Event Hubs and want to confirm or correct my understanding of it. I’m used to relying on retries, poison-message handling, at-least-once delivery, and so on in conventional enterprise messaging solutions, which Azure Service Bus Queues and Topics give me. It seems that Event Hubs is intended as a different tool for very high scale, where you give up some of the more “enterprise” features in exchange for much higher throughput.

Am I thinking about this correctly? Are there additional specifics I need to consider as well? I realize there could be some functional overlap with Event Hubs and Topics, but I'm just looking to get some clarity on how to think of using Event Hubs.


4 Answers

44
votes

If you have the choice, it's almost always easier to write a system based around a full enterprise pub/sub messaging system, where you can mark single events as consumed, retry messages, and use just about every other wonderful feature. If you've already accepted partitioning your message channel (which Azure Service Bus Topics appear to support), then you could in principle scale a more full-featured messaging system to the degree you require. The question is: at what cost?

An Azure Service Bus Topic costs approximately $0.20 per million messages at high scale; Amazon SQS (somewhat similar) lists $0.50 per million. If you host it yourself, you'll likely need to set up a lot of RabbitMQ servers, or even multiple clusters, as you partition.

Azure Event Hubs costs $0.028 per million events plus an amount per throughput unit, and the same goes for Amazon Kinesis. Apache Kafka has been benchmarked at 2 million per second on 3 machines.

At, say, 20,000 events per second sustained, the difference between some Azure Service Bus Topics and Azure Event Hubs is in the range of a full-time developer's salary. At 2 million per second sustained (which requires contacting Microsoft), the difference approaches $1M/month.
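To make the arithmetic behind those figures concrete, here is a rough sketch. It uses only the per-million prices quoted above (which may well be out of date), assumes a 30-day month, and ignores the per-throughput-unit charge for Event Hubs, so treat the results as order-of-magnitude estimates. The function names are made up for illustration.

```python
# Assumption: a 30-day month of fully sustained traffic.
SECONDS_PER_MONTH = 60 * 60 * 24 * 30

# Per-million prices quoted in this answer; not current list prices.
TOPIC_PRICE_PER_MILLION = 0.20
EVENT_HUB_PRICE_PER_MILLION = 0.028  # throughput-unit charge not included


def monthly_cost(events_per_second, price_per_million):
    """Cost in dollars of one month of sustained traffic at the given rate."""
    events_per_month = events_per_second * SECONDS_PER_MONTH
    return events_per_month / 1_000_000 * price_per_million


def monthly_difference(events_per_second):
    """How much more Topics would cost than Event Hubs per month."""
    return (monthly_cost(events_per_second, TOPIC_PRICE_PER_MILLION)
            - monthly_cost(events_per_second, EVENT_HUB_PRICE_PER_MILLION))


if __name__ == "__main__":
    for rate in (20_000, 2_000_000):
        diff = monthly_difference(rate)
        print(f"{rate:>9,} events/s: ~${diff:,.0f}/month difference")
```

Under those assumptions the gap is roughly $8,900/month (about $107,000/year) at 20,000 events per second and roughly $892,000/month at 2 million per second.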

Basically, use partitioned stream/log systems with offset tracking when you either don't need all the useful features of a full messaging system or don't need them enough to pay the ~10X premium (or when you can't use a proper messaging system because it can't be scaled far enough without heroic efforts).

36
votes

I wrote a post a while ago about this topic, with some support from Dan on the Service Bus team. Hopefully it clarifies things for you:

http://microsoftintegration.guru/2015/03/03/azure-event-hubs-vs-azure-messaging/

Service Bus (Messaging)

For messaging it’s about one application telling one or more apps to DO SOMETHING or GIVE ME SOMETHING.

Event Hub (Eventing)

The alternative is that in eventing the applications are saying SOMETHING HAS HAPPENED.

19
votes

Correct!!

The fundamental difference between Event Hubs and Topics is that Topics offer per-message semantics, whereas Event Hubs offers stream semantics. This implies that you should not expect any per-message features or semantics from Event Hubs.

Any middle tier that provides per-message features comes with processing overhead (the tax)!

For example, per-message duplicate detection and per-message receive confirmation (Topics have Message.Complete to acknowledge that a message was received) are both Topic features. Event Hubs narrows down the feature set to provide a better low-latency, high-throughput solution.

To get a feature like at-least-once delivery (which is not available per message in Event Hubs), translate it into stream semantics: read up to a point in a given Event Hub partition, checkpoint, and let the application that consumes those events handle at-least-once delivery.
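A minimal sketch of that read-then-checkpoint pattern, using plain in-memory stand-ins rather than the actual Event Hubs SDK. `Partition`, `CheckpointStore`, `consume`, and `run_demo` are all hypothetical names invented for this illustration; the point is only to show why checkpointing at intervals gives at-least-once (not exactly-once) processing.

```python
class Partition:
    """Hypothetical stand-in for one partition: an append-only log."""

    def __init__(self, events):
        self.events = list(events)

    def read_from(self, offset):
        # Yield (offset, event) pairs starting at the given offset.
        for i in range(offset, len(self.events)):
            yield i, self.events[i]


class CheckpointStore:
    """Hypothetical durable store holding the next offset to read."""

    def __init__(self):
        self.next_offset = 0


def consume(partition, store, handle, checkpoint_every=3, crash_at=None):
    """Process events from the last checkpoint, checkpointing periodically."""
    processed = 0
    for offset, event in partition.read_from(store.next_offset):
        if crash_at is not None and offset == crash_at:
            raise RuntimeError("simulated consumer crash")
        handle(event)
        processed += 1
        if processed % checkpoint_every == 0:
            # Progress is recorded only here, not after every event.
            store.next_offset = offset + 1
    # Reached the end of the log: record the final position.
    store.next_offset = len(partition.events)


def run_demo():
    partition = Partition(f"e{i}" for i in range(7))
    store = CheckpointStore()
    seen = []
    try:
        # First consumer crashes before handling offset 5.
        consume(partition, store, seen.append, crash_at=5)
    except RuntimeError:
        pass
    # A restarted consumer resumes from the last checkpoint (offset 3).
    consume(partition, store, seen.append)
    return seen, store.next_offset


if __name__ == "__main__":
    print(run_demo())
```

In this run the first consumer handles e0 through e4 but has only checkpointed past e2 when it crashes, so the restarted consumer sees e3 and e4 again. Every event is handled at least once, and the application has to tolerate (or de-duplicate) the replays itself.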

more on Event Hubs...

0
votes

According to this MS Learn article and "Choose between Azure messaging services", here is how to choose:

"Azure Event Hubs is designed for high-flow analytics types of events. Azure Service Bus and storage queues are for messages, which can be used for binding the core pieces of any application workflow."

It's important to understand the difference between what events and messages are for, because communication services are typically designed to handle their respective objects: Event Hubs handles events and Service Bus handles messages.

An event triggers a notification that something has occurred. It's "lighter" than a message: it has info about what happened, but doesn't carry the data that triggered the event itself. For example, an event may notify you of a file upload and include info about the file, but not the file itself. Events are usually used for broadcast communication or fan-out workflows, i.e. when you have a large number of subscribers for each publisher. The publisher has no expectation about how an event is handled by the consumer.

A message contains the data that triggers the message pipeline. Unlike an event, the publisher has expectations on how the message is handled by the consumer. For example, a publisher sends a message with raw data produced by a service and expects the consumer to store that data and send back a response when done.

(There's also Event Grid, which handles events too, but is different from Event Hubs. Whereas Event Hubs is designed for big data pipelines that involve analytics, Event Grid is designed for event-driven reactive programming.)