activeMQ and labkey Ben Bimber  2010-12-17 05:27
Status: Closed
 
we have a production server running activeMQ. we have a client machine that executes certain pipeline jobs. that client is

i tried to start a pipeline job in labkey, and the job had a status of 'waiting' for a long time. after a day, i restarted the pipeline java process on the client machine, and immediately this pipeline job started.

In the log of the client machine, I see this:


INFO 2010-12-14 14:01:30,888 [AcitveMQ Connection Worker: tcp://xnight.primate.wisc.edu/192.168.1.15:61616] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Stopping: ActiveMqJmsConnector{this=140b8fd, started=true, initialised=true, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=true, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:01:30,890 [UMOManager.1] org.mule.providers.jms.JmsMessageDispatcher: Disconnected: JmsMessageDispatcher{this=20dcb7, endpoint=jms://status.queue}
INFO 2010-12-14 14:01:30,891 [AcitveMQ Connection Worker: tcp://xnight.primate.wisc.edu/192.168.1.15:61616] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Stopped: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=false, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:01:30,892 [AcitveMQ Connection Worker: tcp://xnight.primate.wisc.edu/192.168.1.15:61616] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Disconnected: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=false, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:01:30,892 [AcitveMQ Connection Worker: tcp://xnight.primate.wisc.edu/192.168.1.15:61616] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Stopped: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=false, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:01:30,896 [UMOManager.5] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Connected: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=false, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:01:30,992 [UMOManager.6] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Starting: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=true, supportedProtocols=[jms], serviceOverrides=null}
ERROR 2010-12-14 14:01:45,993 [UMOManager.6] org.mule.providers.SimpleRetryConnectionStrategy: Failed to connect/reconnect: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=true, supportedProtocols=[jms], serviceOverrides=null}. Root Exception was: Failed to start Jms Connection. Type: class org.mule.umo.lifecycle.LifecycleException
org.mule.umo.lifecycle.LifecycleException: Failed to start Jms Connection
    at org.mule.providers.jms.JmsConnector.doStart(JmsConnector.java:504)
    at org.mule.providers.AbstractConnector.startConnector(AbstractConnector.java:367)
    at org.mule.providers.AbstractConnector.connect(AbstractConnector.java:1186)
    at org.mule.providers.SimpleRetryConnectionStrategy.doConnect(SimpleRetryConnectionStrategy.java:76)
    at org.mule.providers.AbstractConnectionStrategy$1.run(AbstractConnectionStrategy.java:57)
    at org.mule.impl.work.WorkerContext.run(WorkerContext.java:310)
    at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1061)
    at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:575)
    at java.lang.Thread.run(Thread.java:619)
INFO 2010-12-14 14:01:45,993 [UMOManager.6] org.mule.providers.SimpleRetryConnectionStrategy: Waiting for 5000ms before reconnecting. Failed attempt 1 of unlimited
INFO 2010-12-14 14:11:00,817 [AcitveMQ Connection Worker: tcp://xnight.primate.wisc.edu/192.168.1.15:61616] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Stopping: ActiveMqJmsConnector{this=fb6763, started=true, initialised=true, name='jmsConnectorTask', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=true, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:11:16,060 [AcitveMQ Connection Worker: tcp://xnight.primate.wisc.edu/192.168.1.15:61616] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Stopped: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=false, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:11:16,060 [AcitveMQ Connection Worker: tcp://xnight.primate.wisc.edu/192.168.1.15:61616] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Disconnected: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=false, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:11:16,060 [AcitveMQ Connection Worker: tcp://xnight.primate.wisc.edu/192.168.1.15:61616] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Stopped: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=false, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:11:16,064 [UMOManager.7] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Connected: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=false, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:11:16,161 [UMOManager.8] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Starting: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=true, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:11:20,713 [UMOManager.8] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Started: ActiveMqJmsConnector{this=140b8fd, started=true, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=true, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:22:55,355 [AcitveMQ Connection Worker: tcp://xnight.primate.wisc.edu/192.168.1.15:61616] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Stopping: ActiveMqJmsConnector{this=140b8fd, started=true, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=true, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:22:55,356 [AcitveMQ Connection Worker: tcp://xnight.primate.wisc.edu/192.168.1.15:61616] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Stopped: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=false, supportedProtocols=[jms], serviceOverrides=null}
ERROR 2010-12-14 14:22:55,356 [AcitveMQ Connection Worker: tcp://xnight.primate.wisc.edu/192.168.1.15:61616] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Failed to disconnect: Initialisation Failure: The transport is not running.
org.mule.providers.ConnectException: Initialisation Failure: The transport is not running.
    at org.mule.providers.jms.activemq.ActiveMqJmsConnector.doDisconnect(ActiveMqJmsConnector.java:99)
    at org.mule.providers.AbstractConnector.disconnect(AbstractConnector.java:1215)
    at org.mule.providers.AbstractConnector.stopConnector(AbstractConnector.java:437)
    at org.mule.providers.jms.JmsConnector$1.onException(JmsConnector.java:285)
    at org.apache.activemq.ActiveMQConnection$4.run(ActiveMQConnection.java:1524)
    at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1061)
    at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:575)
    at java.lang.Thread.run(Thread.java:619)
Caused by: javax.jms.JMSException: The transport is not running.
    at org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:58)
    at org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1213)
    at org.apache.activemq.ActiveMQConnection.close(ActiveMQConnection.java:579)
    at org.mule.providers.jms.activemq.ActiveMqJmsConnector.doDisconnect(ActiveMqJmsConnector.java:94)
    ... 7 more
Caused by: java.io.IOException: The transport is not running.
    at org.apache.activemq.transport.TransportSupport.checkStarted(TransportSupport.java:103)
    at org.apache.activemq.transport.tcp.TcpTransport.oneway(TcpTransport.java:117)
    at org.apache.activemq.transport.InactivityMonitor.oneway(InactivityMonitor.java:141)
    at org.apache.activemq.transport.TransportFilter.oneway(TransportFilter.java:80)
    at org.apache.activemq.transport.WireFormatNegotiator.oneway(WireFormatNegotiator.java:93)
    at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:47)
    at org.apache.activemq.transport.ResponseCorrelator.asyncRequest(ResponseCorrelator.java:69)
    at org.apache.activemq.transport.ResponseCorrelator.request(ResponseCorrelator.java:79)
    at org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1203)
    ... 9 more
INFO 2010-12-14 14:22:55,357 [AcitveMQ Connection Worker: tcp://xnight.primate.wisc.edu/192.168.1.15:61616] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Stopped: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=false, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:22:55,362 [UMOManager.9] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Connected: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=false, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:22:55,457 [UMOManager.10] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Starting: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=true, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:23:01,062 [UMOManager.10] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Started: ActiveMqJmsConnector{this=140b8fd, started=true, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=true, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:26:57,737 [AcitveMQ Connection Worker: tcp://xnight.primate.wisc.edu/192.168.1.15:61616] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Stopping: ActiveMqJmsConnector{this=140b8fd, started=true, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=true, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:26:57,738 [AcitveMQ Connection Worker: tcp://xnight.primate.wisc.edu/192.168.1.15:61616] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Stopped: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=false, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:26:57,738 [AcitveMQ Connection Worker: tcp://xnight.primate.wisc.edu/192.168.1.15:61616] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Disconnected: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=false, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:26:57,738 [AcitveMQ Connection Worker: tcp://xnight.primate.wisc.edu/192.168.1.15:61616] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Stopped: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=false, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:26:57,740 [UMOManager.11] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Connected: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=false, supportedProtocols=[jms], serviceOverrides=null}
INFO 2010-12-14 14:26:57,839 [UMOManager.12] org.mule.providers.jms.activemq.ActiveMqJmsConnector: Starting: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=true, supportedProtocols=[jms], serviceOverrides=null}
ERROR 2010-12-14 14:27:12,839 [UMOManager.12] org.mule.providers.SimpleRetryConnectionStrategy: Failed to connect/reconnect: ActiveMqJmsConnector{this=140b8fd, started=false, initialised=false, name='jmsConnectorStatus', disposed=false, numberOfConcurrentTransactedReceivers=4, createMultipleTransactedReceivers=true, connected=true, supportedProtocols=[jms], serviceOverrides=null}. Root Exception was: Failed to start Jms Connection. Type: class org.mule.umo.lifecycle.LifecycleException
org.mule.umo.lifecycle.LifecycleException: Failed to start Jms Connection
    at org.mule.providers.jms.JmsConnector.doStart(JmsConnector.java:504)
    at org.mule.providers.AbstractConnector.startConnector(AbstractConnector.java:367)
    at org.mule.providers.AbstractConnector.connect(AbstractConnector.java:1186)
    at org.mule.providers.SimpleRetryConnectionStrategy.doConnect(SimpleRetryConnectionStrategy.java:76)
    at org.mule.providers.AbstractConnectionStrategy$1.run(AbstractConnectionStrategy.java:57)
    at org.mule.impl.work.WorkerContext.run(WorkerContext.java:310)
    at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1061)
    at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:575)
    at java.lang.Thread.run(Thread.java:619)
INFO 2010-12-14 14:27:12,840 [UMOManager.12] org.mule.providers.SimpleRetryConnectionStrategy: Waiting for 5000ms before reconnecting. Failed attempt 1 of unlimited



That block represents the entries on 12-14. After this last one it stopped retrying. On 12-12 the log shows a successful pipeline job (plus many unsuccessful, but that wasnt activeMQ's fault) and on 12-17 I restarted the java process on the client and it successfully started the job. There's no entries between 12-14 and 12-17.

The logs seems to suggest the activeMQ connector is shutting down? Is that just part of trying to reconnect? Does this log suggest anything else I should be testing?

For what it's worth, the client machine runs monit, which periodically checks to see if the java process is running (it was) and it also tests port 61616 on xnight. Neither of these threw errors during this time period. Thanks for any help.
 
 
Brian Connolly responded:  2010-12-17 06:40
Ben,

I am not of exactly how the code is written but I know that when the LabKey server or the LabKey remote pipeline server (ie what you have installed on gems) loses connectivity to the activeMQ server they do something like

1) Try to connect to the server for x number of times
2) wait 1 seconds try again
3) wait 5 seconds try again
4) wait 10 seconds try again
....

this will go on until eventually either the server stops attempting to reconnect or the time between reconnect attempts is large.

I believe frequency of the errors you are seeing from activeMQ will probably go down when you are able to clear up the perf problems on xnight. In the mean-time, I would recommend that if you have to restart the activeMQ process on xnight or reboot xnight, then restart the LabKey remote pipeline server on gems afterwards.

About your monit monitoring, please note that monit is simply checking if the activeMQ server is listening on the tcp port. This check is not sufficient to determine if activeMQ is fully operational (ie tasks will run on the pipeline). The current monitoring is designed to simply check if the activeMQ server has been started or not. To determine if the server is running and fully operational (ie you are able to put tasks on the queue and take them off, etc) then you will need add monitoring with tools like http://activemq.apache.org/how-can-i-monitor-activemq.html or http://it.toolbox.com/blogs/unix-sysadmin/monitoring-activemq-from-nagios-27743