
I have been monitoring the end-to-end latency of my microservice applications. Each service is loosely coupled via an ActiveMQ Artemis queue.

-------------     -------------     -------------
| Service 1 | --> | Service 2 | --> | Service 3 |
-------------     -------------     -------------

Service 1 listens as an HTTP endpoint and produces to queue 1. Service 2 consumes from queue 1, modifies the message, and produces to queue 2. Service 3 consumes from queue 2. Each service inserts a row into its own database table, so from there I can also monitor latency. So "end-to-end" means going into "Service 1" and coming out of "Service 3".

Each service's processing time remains steady, and most messages have a reasonable e2e latency of a few milliseconds. I produce at a constant rate of 400 req/sec using JMeter, and I can monitor this via Grafana.

Sporadically I notice a dip in this constant rate which can be seen throughout the chain. At first I thought it could be the producer side (Service 1), since the rate suddenly dropped to 370 req/sec; that might be attributed to GC or possibly a fault in the JMeter HTTP simulator, but it does not explain why the e2e latency of certain messages jumps to ~2-3 sec.

Since it would be hard to reproduce my scenario, I checked out this load generator for ActiveMQ Artemis and bumped the versions up to 2.17.0, 5.16.2 & 0.58.0 to match my broker, which runs 2.17.0 as a cluster of 2 masters/slaves using NFSv4 shared storage.
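
The master/slave pairing itself is the usual shared-store HA configuration in broker.xml, roughly along these lines (simplified and illustrative, not a copy of my actual file; the backup broker mirrors it with a <slave/> block):

<ha-policy>
   <shared-store>
      <master>
         <!-- shut-down master hands over to the slave that shares the NFS store -->
         <failover-on-shutdown>true</failover-on-shutdown>
      </master>
   </shared-store>
</ha-policy>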

The command below generates 5,000,000 persistent messages to a single queue q6, with 4 producers/consumers and a maximum overall produce rate of 400 msg/sec. The only code change in the artemis-load-generator was in ConsumerLatencyRecorderTask: when elapsedTime > 1 sec, I print out the message ID and latency.

java -jar destination-bench.jar --persistent --bytes 1000 --protocol artemis --url tcp://localhost:61616?producerMaxRate=100 --out /tmp/test1.txt --name q6 --iterations 5000000 --runs 1 --warmup 20000 --forks 4 --destinations 1 
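
For reference, the extra check added in ConsumerLatencyRecorderTask is essentially the following (a sketch only; the class and field names around it in the load generator may differ slightly):

import java.util.concurrent.TimeUnit;

final class SlowMessageLogger {
    // Print outliers immediately instead of only recording them in the latency histogram.
    static void reportIfSlow(String messageId, long elapsedNanos) {
        if (elapsedNanos > TimeUnit.SECONDS.toNanos(1)) {
            System.out.println("SLOW message id=" + messageId
                    + " latency=" + TimeUnit.NANOSECONDS.toMillis(elapsedNanos) + " ms");
        }
    }
}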

From this I noticed that there were outlier messages with produce/consume latency nearing 2 sec, while most (90.00%) were below 3358.72 microseconds. I am not sure why or how this happens. Is this reasonable?

EDIT/UPDATE: I have run the test a few times; this is the output of a shorter run.

java -jar destination-bench.jar --persistent --bytes 1000 --protocol artemis --url tcp://localhost:61616?producerMaxRate=100 --out ~/test-perf1.txt --name q6 --iterations 400000 --runs 1 --warmup 20000 --forks 4 --destinations 1

The result is below

 RUN 1   EndToEnd Throughput: 398 ops/sec
**************
EndToEnd SERVICE-TIME Latencies distribution in MICROSECONDS
mean              10117.30
min                 954.37
50.00%             1695.74
90.00%             2637.82
99.00%           177209.34
99.90%           847249.41
99.99%           859832.32
max             5939134.46
count              1600000

Regarding JVM thread status (screenshot omitted): what I am noticing in my actual system is a lot of TIMED_WAITING threads on the broker, and where there are spikes the push-to-queue latency seems to increase.
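
One way to see what those threads are actually waiting on is to grab a few thread dumps from the broker JVM around the time of a spike, using standard JDK tooling (the PID placeholder is specific to the environment):

jcmd <broker-pid> Thread.print > /tmp/broker-threads-$(date +%s).txt

Comparing a dump taken during a dip with one taken during normal throughput should show whether the waiting is on disk I/O, locks, or something else.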

Currently my data is, as I said, hosted on NFSv4. I read in the Artemis persistence section that:

If the journal is on a volume which is shared with other processes which might be writing other files (e.g. bindings journal, database, or transaction coordinator) then the disk head may well be moving rapidly between these files as it writes them, thus drastically reducing performance.

Should I move the bindings folder off NFS onto the VM's local disk? Will this improve performance? It is unclear to me.

How does this affect Shared Store HA?
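
For context, these are the directory settings in broker.xml that would be involved if the bindings (and paging) were split off the shared volume; the paths below are purely illustrative, not my actual layout:

<core xmlns="urn:activemq:core">
   <!-- candidates to move onto local VM disk -->
   <bindings-directory>/var/lib/artemis/local/bindings</bindings-directory>
   <paging-directory>/var/lib/artemis/local/paging</paging-directory>
   <!-- message journal staying on the NFSv4 shared store -->
   <journal-directory>/mnt/nfs/artemis/journal</journal-directory>
   <large-messages-directory>/mnt/nfs/artemis/large-messages</large-messages-directory>
</core>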

Regarding your second point, I used the load generator with a single instance. The result is the same: as I mentioned, I see a sporadic drop in throughput, but more troubling is the large increase in latency. – dim5b
Thanks for your time and effort; I have updated my question. I tried to troubleshoot by monitoring threads and disk. In my real scenario there are multiple queues (10) on the broker, and when there is load there seems to be an increase in latency when pushing to the queue from my applications. – dim5b

1 Answer


I started a fresh, default instance of ActiveMQ Artemis 2.17.0, cloned and built the artemis-load-generator (with a modification to alert immediately on messages that take > 1 second to process), and then ran the same command you ran. I let the test run for about an hour on my local machine, but I didn't let it finish because it was going to take over 3 hours (5 million messages at 400 messages per second). Out of roughly 1 million messages I saw only 1 "outlier" - certainly nothing close to the 10% you're seeing. It's worth noting that I was still using my computer for my normal development work during this time.

At this point I have to attribute this to some kind of environmental issue, e.g.:

  • Garbage Collection
  • Low performance disk
  • Network latency
  • Insufficient CPU, RAM, etc.
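
To rule the first item in or out, GC logging is cheap to enable on the broker (and on the client JVMs). For example, with a JDK 9+ JVM you could add something like the following to the broker's JVM arguments (e.g. JAVA_ARGS in the instance's artemis.profile):

-Xlog:gc*:file=/tmp/artemis-gc.log:time,uptime

If pauses in that log line up with the latency spikes, the problem is heap/GC tuning rather than the disk or the network.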