In my topology I see around 1 to 2 ms of latency when transferring tuples from spouts to bolts or from bolts to bolts. I measure the latency with nanosecond timestamps, which is possible because the whole topology runs inside a single worker. The topology runs on a cluster with production-grade hardware.
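To make the measurement idea concrete, here is a minimal sketch of the same approach outside Storm. It is not the topology code: the class name `HandoffLatencyProbe` and the use of an `ArrayBlockingQueue` are assumptions made purely for illustration, standing in for a spout thread handing a timestamped event to a bolt thread in the same JVM. It only shows how a `System.nanoTime()` stamp taken at emit time is compared with one taken at receive time.

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical stand-in for a spout -> bolt handoff inside one JVM.
class HandoffLatencyProbe {

    // Sends `count` timestamps from a producer thread to the calling thread
    // and returns the per-event handoff latency in nanoseconds.
    static long[] measure(int count, long pauseMillis) throws InterruptedException {
        BlockingQueue<Long> queue = new ArrayBlockingQueue<>(1024);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) {
                    queue.put(System.nanoTime()); // stamp at emit time
                    Thread.sleep(pauseMillis);    // pace the events
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        long[] latencies = new long[count];
        for (int i = 0; i < count; i++) {
            long emittedAt = queue.take();                 // blocks until an event arrives
            latencies[i] = System.nanoTime() - emittedAt;  // stamp at receive time
        }
        producer.join();
        return latencies;
    }

    public static void main(String[] args) throws InterruptedException {
        // 100 events with a 10 ms pause, i.e. roughly 100 events per second.
        long[] latencies = measure(100, 10);
        Arrays.sort(latencies);
        System.out.printf("median = %.1f us, max = %.1f us%n",
                latencies[latencies.length / 2] / 1000.0,
                latencies[latencies.length - 1] / 1000.0);
    }
}
```

Running something like this on the same hardware gives a rough baseline for a plain thread-to-thread handoff at the same event rate, which can then be compared with the 1 to 2 ms seen in the topology.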
As I understand it, tuples do not need to be serialized/deserialized in this case, since everything is inside a single JVM. I have set the parallelism hint for most spouts and bolts to 5, and the spouts only produce events at a rate of 100 per second. I don't think the high latency is due to queuing of events, because the latency does not increase over time. Memory usage does not grow either. Log levels are set to ERROR. CPU usage is in the range of 200 to 300%.
What could be causing this latency? I was expecting only a few microseconds for tuple transfer.