Below is a screenshot of my topology's Storm UI. This was taken after the topology finished processing 10k messages.
(The topology is configured with 4 workers and uses a KafkaSpout).
The sum of the "process latency" of my bolts is about 8100ms and the complete latency of the topology is a much longer 115881ms.
I'm aware that this sort of discrepancy can occur due to resource contention or something related to Storm internals. I believe resource contention is not an issue here; the GC didn't run at all during this test, and profiling shows that I have plenty of available CPU.
So I assume the issue is that I am abusing Storm internals in some way. Any suggestions where to look?
Tuples must be waiting somewhere, possibly in the spouts: either waiting to be emitted into the topology, or waiting to be acked once their messages have been processed.
Should I perhaps adjust the number of ackers (I have set the acker count to 4, the same as the number of workers)?
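For reference, these are the topology-level settings I could experiment with (the values below are illustrative placeholders, not my actual settings). My understanding is that `topology.max.spout.pending`, in particular, bounds how many tuples may be in flight per spout task, and leaving it unset lets tuples queue up behind the spout, which would inflate complete latency:

```
# Illustrative topology config overrides (values are placeholders)
topology.acker.executors: 4        # currently matches my worker count
topology.max.spout.pending: 1000   # max unacked tuples per spout task; unset = unbounded
```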
Any other general advice for how I should troubleshoot this?
*Note that the one bolt with a large discrepancy between its process and execute latencies implements the tick-tuple batching pattern, so that discrepancy is expected.
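(For context, the batching in that bolt looks roughly like the sketch below; the class and field names are simplified placeholders, not my actual code. Buffered tuples are only acked when a tick tuple flushes the batch, which is why process latency far exceeds execute latency there.)

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import org.apache.storm.Config;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.utils.TupleUtils;

// Simplified sketch of the tick-tuple batching pattern (placeholder names).
public class BatchingBolt extends BaseRichBolt {
    private final List<Tuple> buffer = new ArrayList<>();
    private OutputCollector collector;

    @Override
    public Map<String, Object> getComponentConfiguration() {
        // Ask Storm to deliver a tick tuple to this bolt every 10 seconds.
        Config conf = new Config();
        conf.put(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, 10);
        return conf;
    }

    @Override
    public void prepare(Map<String, Object> topoConf, TopologyContext context,
                        OutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void execute(Tuple tuple) {
        if (TupleUtils.isTick(tuple)) {
            // Flush the batch, then ack everything that was buffered.
            // Acking is deferred until here, so process latency >> execute latency.
            for (Tuple t : buffer) {
                collector.ack(t);
            }
            buffer.clear();
        } else {
            buffer.add(tuple); // buffered; not acked until the next tick
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // No downstream output in this simplified sketch.
    }
}
```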
*Edit: I suspect the discrepancy might involve messages being acked by the Spout only after being fully processed. If I refresh the Storm UI while the topology is processing, the acked count for my final Bolt increases much more quickly than the acked count for the Spouts. That said, this may simply be because the Spout acks far fewer messages than the final Bolt: a few hundred messages acked by the final Bolt may correspond to a single message in the Spout. Still, I thought I should mention this suspicion to get opinions on whether it's possible that the Spout's acker tasks are overflowing.
