0
votes

We include dates in the events sent to our hub. Whenever I connect a new Azure Function to our Event Hub with a new consumer group, it seems to receive every event ever sent to the hub. This is somewhat expected. However, I set the Message Retention on the hub to 1 day, so I expected a new consumer to receive at most one day's worth of events. Instead it receives events that are months old (based on the date within the message), and far more events than we generate in a day.

Based on this page:

https://blogs.msdn.microsoft.com/servicebus/2015/03/09/data-retention-in-event-hubs/

It seems like maybe this retention period is somewhat irrelevant, or misleading. If the "container" hasn't filled up yet, it could contain messages forever. If, for example, the container has a limit of 1000 messages before the event hub looks at it, but it takes a year to generate 1000 messages, does that mean any new consumer could get year-old messages, even with a 1-day "retention period"?

When the container does hit the limit of 1000 messages, are the messages older than 1 day discarded and the messages within the retention period retained? Or is the whole container discarded?

From looking at our test and prod environments it seems like this container fits at least 50000 messages (or equivalent size).

Is a checkpoint the only way to limit this initial influx of messages for a new consumer group?


1 Answer

2
votes

Retention time is the minimum guaranteed period, not a maximum or an exact value. A 1-day retention means you will have all the messages from the last day, but possibly some older messages too.

So you can rely on 1 day of retention, but be prepared to see older messages too.
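
Because older messages can show up, one option is to drop them on the consumer side using the date already included in each event. The following is only a minimal sketch: it assumes each event is a dict with a hypothetical `timestamp` field holding an ISO-8601 string with a UTC offset, and the function names are made up for illustration. It does not use any Event Hubs API.

```python
from datetime import datetime, timedelta, timezone

# Assumed to match the hub's Message Retention setting (1 day in the question).
RETENTION = timedelta(days=1)


def is_within_retention(event_time, now=None, retention=RETENTION):
    """Return True if event_time is no older than `retention` relative to `now`."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - event_time <= retention


def filter_recent(events, now=None, retention=RETENTION):
    """Keep only events whose embedded timestamp is inside the retention window.

    `events` is a list of dicts with a hypothetical "timestamp" key.
    """
    recent = []
    for event in events:
        event_time = datetime.fromisoformat(event["timestamp"])
        if is_within_retention(event_time, now, retention):
            recent.append(event)
    return recent


if __name__ == "__main__":
    now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
    events = [
        {"id": 1, "timestamp": "2023-10-01T00:00:00+00:00"},  # months old
        {"id": 2, "timestamp": "2024-01-02T06:00:00+00:00"},  # 6 hours old
    ]
    print([e["id"] for e in filter_recent(events, now=now)])
```

Passing `now` explicitly keeps the check deterministic for testing; in a real function you would leave it out so the current UTC time is used.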