I'm working on a server-side project that provides a request/response service over TIBCO EMS, and I'm looking for advice on best practice for achieving both scalability and low latency in this service. I'm doing this on .NET, but since TIBCO EMS supposedly implements the JMS specification, I assume that advice for other JMS implementations and platforms (Java) would be relevant too.
Currently, we are using one Connection, one Session and one Consumer, and we listen for messages using a callback on that single Consumer. Each request is processed synchronously on the callback thread, and the reply is sent on a different Queue (but the same Session). This works, but does not appear to scale: the CPU load is negligible even at high transaction rates, yet the latency per request keeps growing.
I assume what is happening is that EMS uses a single thread for the callback, so the processing time plus the time required to send the reply blocks other requests from being processed. What is the best way of getting this to scale?
One way would be to schedule the actual processing of an incoming request on the thread pool as soon as it is received. This is a quick fix and would scale, but it would introduce additional latency as well as threading concerns around use of the Session. Another way would be to have a number of Session objects, or even Connection objects. Can anyone please advise on best practice for doing this? I imagine it must be one of the more common usage patterns out there...
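To make the first option concrete, here is a rough sketch of what I have in mind. It is written in plain Java rather than .NET, and it does not use the EMS or JMS API at all: FakeSession and HandOffSketch are hypothetical names, and FakeSession is only a stand-in for a Session plus reply producer that must not be shared between threads. The 20 ms sleep is an assumed processing time, not a measurement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a Session plus reply producer (not a real EMS/JMS type).
// As with a real Session, one instance is only ever used by one thread.
final class FakeSession {
    private static final AtomicInteger NEXT_ID = new AtomicInteger();
    final int id = NEXT_ID.incrementAndGet();

    String reply(String request) throws InterruptedException {
        Thread.sleep(20); // assumed processing time plus time to send the reply
        return "reply-to-" + request + " via session " + id;
    }
}

final class HandOffSketch {
    // Pushes `requests` simulated requests through `workers` threads
    // and returns the replies that were produced.
    static List<String> run(int workers, int requests) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        // One FakeSession per worker thread, so no session is shared across threads.
        ThreadLocal<FakeSession> session = ThreadLocal.withInitial(FakeSession::new);
        ConcurrentLinkedQueue<String> replies = new ConcurrentLinkedQueue<>();

        // This loop plays the role of the single consumer callback thread:
        // it only hands each request off and moves on to the next one.
        for (int i = 0; i < requests; i++) {
            String request = "req-" + i;
            pool.submit(() -> {
                try {
                    replies.add(session.get().reply(request));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return new ArrayList<>(replies);
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.nanoTime();
        List<String> replies = run(4, 16);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(replies.size() + " replies in about " + elapsedMs + " ms");
    }
}
```

In this sketch each worker thread gets its own session object for sending replies, which is how I would try to sidestep the threading concern. What it does not model is acknowledgement: in the real service the receiving Session would presumably acknowledge a message when the callback returns, which is before the reply has been sent.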