21
votes

We have 4 ActiveMQ brokers (each running on a separate server) setup in a network of brokers. There about 60 producers. The producers lookup the ActiveMQ connection factory from Glassfish using JDNI.

The ActiveMQ URI configured in Glassfish is as follows:

failover:(tcp://phxgapm01:61616,tcp://phxgapm02:61616,tcp://phxgapm03:61616,tcp://phxgapm04:61616)?randomize=true&backup=false&maxReconnectAttempts=8

Each producer process does a JNDI lookup of the javax.jms.ConnectionFactory and then creates 1 javax.jms.Connection. As the producer runs it will periodically create a javax.jms.Session and javax.jms.MessageProducer, send some messages to a queue and then close the Session and MessageProducer.

That is all background - now to my question. From some, but not all of the producers, we'll see a stream of log output like the following:

2014-12-30 21:07:06,534 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm03:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,538 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm04:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,544 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm02:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,548 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm04:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,552 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm01:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,556 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm02:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,561 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm02:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,565 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm01:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,568 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm02:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,572 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm04:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,577 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm03:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,581 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm04:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,586 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm01:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,590 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm03:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,594 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm04:61616 - [ActiveMQ Task-1]

For some of the producers we'll see this output every 10 minutes - for others it is less frequent. Even more confusing is that all of these producers use identical code for the JMS messaging - so while producers may vary in the how frequently they create sessions and messageproducers, they all use the same code and all create just 1 connection object.

From reading the documentation, my understanding is that the failover transport would open a connection to 1 of the brokers (selecting randomly in our case). Why do we see this stream of connections (multiple connections to each of the brokers within 60ms)? Using netstat we can see these connections. Is this normal? If not, any suggestions as to what might be causing this?

1
Is the code using straight JMS or JMSTemplate etc. examples are helpful. Is there a PooledConnectionFactory in use.Tim Bish
It is straight JMS - There is no PooledConnectionFactory being used (at least not directly)sceaj
If there's an XA Connection Factory then you need to use the PooledConnectionFactory, otherwise it will disconnect / reconnect all the time. You can see the connection count grow in one of the management topics (cant remember which one)stringy05
if it's not idleTimeout, what about firewall?MarianP
you should turn on debug logging in amqMarianP

1 Answers

1
votes

Without having the activeMQ loglevels raised there is some room to speculation, but:

  • "for others it is less frequent" - Observing different behaviour on different instances in this case is completely natural. If the load is not perfectly distributed between instances, they will behave differently when it comes to messaging. Just imagine one of your nodes picking up 90% of the triggering inputs and doing most of the work alone. That node would be under much higher load and would use its JMS resources totally differently from the rest of the nodes.
  • "my understanding is that the failover transport would open a connection to 1 of the brokers" - That is completely true. Failover will come in play only if the the wrapping connectionFactory requests a new physical connection to be opened. In this case there will be exactly one connection created for that request.
  • "Why do we see this stream of connections" - I am pretty sure that this is due to having a pooling implementation in your project. You can see that there are those connections established to all of your brokers (randomly distributed), indicating multiple requests for new connections at the same time.

By increasing the loglevel in your application you would be able to see exactly which layer initiates this and for what reason. Possible reasons are: "all connections were expired and the pool restores the minIdleConnection count to the minimum"; "some incoming request in your application requires lots of messages to be sent at once, so your pool creates them".