Feb 9th, 2010, 11:10 AM
I am developing MDPs which consume messages from a remote queue. The MDPs need to be up and running all the time. I have DefaultMessageListenerContainer as the container with 8 concurrent consumers. The MDPs run without problems.
I intend to monitor the MDPs via a Unix cron job (it calls JMX programs to query the status of components), so that it can alert the application support team about serious errors such as the application server being down or the web app not running. However, I cannot monitor the MDPs themselves reliably.
For example, if the MDPs cannot consume a message and keep retrying for a long time, how do I find out that there is a problem?
I am aware of ExceptionListeners, but they do not cover errors which are temporary (like connection failures due to a busy network or busy queues), i.e. if the MDPs have recovered, there is no way of knowing it.
Is there any way I can check the MDPs so I can know if the MDP has recovered after an error?
I have searched everywhere on the web but can't seem to find a lead.
Your inputs are welcome.
Thanks in advance.
Feb 9th, 2010, 12:40 PM
There are a couple of ways we've handled this in the past...and it's mostly an operations issue, not a programming issue.
If you have redelivery set up for your listeners/queues, your application should keep trying to redeliver over some period until it finally gives up and the message ends up in the dead-letter queue (DLQ).
You can use a tool like Splunk to monitor logs for errors, and you can monitor both your queues and the DLQ from outside your application.
We'd usually monitor the queues and alert if they stayed above some threshold for a period of time (e.g. over 100 messages in a queue for more than 15 minutes). We'd also monitor the DLQ for ANY messages, since a message making it to the DLQ means that the app has terminated its attempts to redeliver the message and finally given up.
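To make those two rules concrete, here is a rough Java sketch of the decision logic only. The class and method names (QueueAlertRule, recordQueueDepth, dlqNeedsAlert) are made up for illustration, and it assumes something else (a JMX query, a broker admin API, a monitoring agent) supplies the queue depths on each poll. It is not tied to any particular broker or tool.

```java
import java.time.Duration;
import java.time.Instant;

/**
 * Hypothetical helper: decides when to alert based on sampled queue depths.
 * How the depths are obtained is deliberately left to the caller.
 */
class QueueAlertRule {
    private final int depthThreshold;
    private final Duration maxTimeAboveThreshold;
    // Time of the first sample in the current run above the threshold;
    // null while the queue is at or below the threshold.
    private Instant aboveSince;

    QueueAlertRule(int depthThreshold, Duration maxTimeAboveThreshold) {
        this.depthThreshold = depthThreshold;
        this.maxTimeAboveThreshold = maxTimeAboveThreshold;
    }

    /**
     * Records one depth sample. Returns true once the depth has stayed
     * above the threshold for longer than the allowed duration.
     */
    boolean recordQueueDepth(int depth, Instant sampledAt) {
        if (depth <= depthThreshold) {
            aboveSince = null; // queue drained, so the consumers have caught up
            return false;
        }
        if (aboveSince == null) {
            aboveSince = sampledAt;
        }
        Duration timeAbove = Duration.between(aboveSince, sampledAt);
        return timeAbove.compareTo(maxTimeAboveThreshold) > 0;
    }

    /** Any message at all in the DLQ is treated as an alert. */
    static boolean dlqNeedsAlert(int dlqDepth) {
        return dlqDepth > 0;
    }
}
```

A cron-driven poller would call recordQueueDepth(...) with each new reading, e.g. with new QueueAlertRule(100, Duration.ofMinutes(15)) for the numbers above. A reading back at or below the threshold resets the timer, which also gives a crude "recovered" signal.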
This type of monitoring is best done with an enterprise tool outside your application (like Hyperic or Wily).