Thread: Message loss after a network disconnect

  1. #1
    Join Date
    Mar 2012
    Posts
    11

    Message loss after a network disconnect

    Hi,

    If I disconnect my consumer server from the RabbitMQ server (by simply unplugging the network cable) and then reconnect, the first message sent after this event is lost, but all subsequent messages are received correctly by the consumer. The queue consumers are configured with the auto-ack property. After the first message is sent to the queue, the queue immediately looks empty, but the message is never received by the consumer.

    Am I missing a configuration setting or is it a bug?
    I am using RabbitMQ 2.8.4 and spring-amqp 1.1.2

    Thanks for your help.

  2. #2
    Join Date
    Mar 2010
    Location
    Gtr Philadelphia, PA
    Posts
    2,016

    Pulling cables is never a good idea unless you have heartbeats enabled, so that both ends get to know the connection was broken.

    That said, I wouldn't expect a complete loss of a message, but it's certainly possible to be left with an un-acked message in a queue after such a failure.
    Gary P. Russell
    Spring Integration Team
    SpringSource, a division of VMware

  3. #3
    Join Date
    Mar 2012
    Posts
    11

    Hi Gary,
    No message is left in queue after this. This problem is consistently reproducible.

    Are you recommending to set heartbeats?
    You said "Pulling cables is never a good idea". What is the best way to test network interruption?

    Thanks,

  4. #4
    Join Date
    Mar 2010
    Location
    Gtr Philadelphia, PA
    Posts
    2,016

    All I am saying is that, the way TCP works, if you pull an Ethernet cable then, depending on the network topology, one end or the other might not know that the socket was disconnected.

    The underlying rabbit connection factory supports heartbeats which can be used to detect this condition.
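
    For reference, the RabbitMQ Java client exposes this setting as ConnectionFactory.setRequestedHeartbeat(int seconds). How a heartbeat lets one end notice a silent peer can be sketched with a toy monitor. The class name HeartbeatMonitor and the "two silent intervals" timeout are illustrative assumptions, not the client's actual implementation:

```java
import java.util.concurrent.TimeUnit;

/** Toy model of heartbeat-based dead-peer detection (hypothetical class). */
final class HeartbeatMonitor {
    private final long timeoutNanos;
    private long lastFrameNanos;

    HeartbeatMonitor(long intervalSeconds, long nowNanos) {
        // Assumption: the peer is declared dead after two silent intervals.
        this.timeoutNanos = TimeUnit.SECONDS.toNanos(2 * intervalSeconds);
        this.lastFrameNanos = nowNanos;
    }

    /** Call whenever any frame (data or heartbeat) arrives from the peer. */
    void onFrameReceived(long nowNanos) {
        lastFrameNanos = nowNanos;
    }

    /** True once the peer has been silent for longer than the timeout. */
    boolean isPeerDead(long nowNanos) {
        return nowNanos - lastFrameNanos > timeoutNanos;
    }

    public static void main(String[] args) {
        long second = TimeUnit.SECONDS.toNanos(1);
        HeartbeatMonitor monitor = new HeartbeatMonitor(10, 0);
        monitor.onFrameReceived(5 * second);
        System.out.println(monitor.isPeerDead(20 * second)); // 15 s of silence
        System.out.println(monitor.isPeerDead(30 * second)); // 25 s of silence
    }
}
```

    Without such a timer, an idle TCP socket gives neither side any signal that the cable was pulled, which is the condition described above.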

    It appears that the new RabbitMQ 3.0.0 release enables heartbeats by default...

    https://www.rabbitmq.com/release-notes/README-3.0.0.txt

    As I said, I would not expect messages to be lost, but unacked messages could exist; they wouldn't be in the queue, and the broker won't redeliver them until it finds out the connection was lost.

    If you can reproduce it with a simple example, and/or provide a log, I will take a look.
    Gary P. Russell
    Spring Integration Team
    SpringSource, a division of VMware

  5. #5
    Join Date
    Mar 2012
    Posts
    11

    The following simple example can reproduce the problem.

    Code:
    <rabbit:connection-factory id="connectionFactory" host="myhost" port="5672" channel-cache-size="10" />
    <rabbit:admin connection-factory="connectionFactory" />
    <rabbit:queue id="jobExecQ" name="Q_JobExec" queue-arguments="haQ" />
    <rabbit:queue-arguments id="haQ">
        <entry key="x-ha-policy" value="all" />
    </rabbit:queue-arguments>

    <!-- acknowledge="none" is what the RabbitMQ client calls auto-ack -->
    <rabbit:listener-container connection-factory="connectionFactory" concurrency="5" acknowledge="none">
        <rabbit:listener ref="jobExecutor" queues="jobExecQ" />
    </rabbit:listener-container>
    <bean id="jobExecutor" class="test.amqp.JobExecutor" />

    And here is the MessageListener code:

    Code:
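    package test.amqp;

    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;
    import org.springframework.amqp.core.Message;
    import org.springframework.amqp.core.MessageListener;
    import org.springframework.context.ApplicationContext;
    import org.springframework.context.support.ClassPathXmlApplicationContext;

    public class JobExecutor implements MessageListener {

      private static final Log log = LogFactory.getLog(JobExecutor.class);

      // Invoked by the listener container for each delivered message.
      public void onMessage(Message rmqMessage) {
        String messageText = new String(rmqMessage.getBody());
        log.info("Received message: " + messageText);
      }

      public static void main(String[] args) throws Exception {
        // Loading the context starts the listener container and its consumers.
        ApplicationContext context = new ClassPathXmlApplicationContext("rabbitConfiguration.xml");
      }
    }

    To reproduce: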
    1. Start the app by running the main method of the class above.
    2. Send a message to the Q_JobExec queue through the default exchange (I used the RabbitMQ management console for this).
    3. Check the log to see the received message.
    4. Unplug the network cable for a few seconds, then plug it back in.
    5. Send a second message as in step 2.
    6. Send a third, fourth, ... message.

    The message in step 5 is lost.

    Thanks for your help.
    Last edited by rasadoll; Nov 22nd, 2012 at 09:48 PM.

  6. #6
    Join Date
    Mar 2010
    Location
    Gtr Philadelphia, PA
    Posts
    2,016

    Please use [ code ] ... [ /code ] tags (no spaces inside brackets) around code and config.

    Did you pull the cable on the consumer, or the server?
    Is there a network switch between the consumer and the server? (Likely there is, because the server won't find out the connection was broken until it tries to send the next message).
    Do you see anything interesting in the logs on the consumer (with TRACE level) regarding the "lost" message after the connection is re-established?
    What about in the Rabbit log on the server?
    If you see no evidence of the "lost" message arriving in the consumer's logs, I suggest you raise this question on the RabbitMQ mailing list.
    Gary P. Russell
    Spring Integration Team
    SpringSource, a division of VMware

  7. #7
    Join Date
    Mar 2012
    Posts
    11

    Hi Gary,
    The cable is pulled on the consumer side.
    I don't know our company's network architecture. However, the consumer is on my machine and the server is on a VM.
    There is nothing interesting in the RabbitMQ log, and I don't think it is a RabbitMQ issue. Before switching to spring-amqp we were using the RabbitMQ client API and had implemented an HA client to survive connection loss and similar failures. With that implementation we don't see this problem, and all messages are received after reconnecting.
    I have also found that setting heartbeats fixes this issue. Is this the right/best solution?

    Thanks

  8. #8
    Join Date
    Mar 2010
    Location
    Gtr Philadelphia, PA
    Posts
    2,016

    Interesting - I can't reproduce your problem; my client is running on Linux, and I don't have a native Windows box I can use. What is your client's OS?

    However, I am running with RabbitMQ 3.0.0 (which enables heartbeats by default according to the release notes).

    What's weird is (with Linux), I see no connection break - when I pull the cable the connection remains 'established' (with a non-zero send-Q) and when I reconnect everything works fine.

    I have to go out for turkey day now, but I'll see if I can reproduce by disabling heartbeats over the holidays.

    I assume you do realize that auto-ack ("none" in spring-amqp parlance) is dangerous, though; right?
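
    A toy model may help show why auto-ack is risky when the broker writes to a connection whose peer is already gone. ToyBroker and its methods are hypothetical names, not RabbitMQ's implementation, and nothing in this thread confirms that this is what happened in the reported case:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy broker contrasting auto-ack with manual ack (hypothetical class). */
final class ToyBroker {
    private final Deque<String> ready = new ArrayDeque<>();
    private final List<String> unacked = new ArrayList<>();
    private final boolean autoAck;

    ToyBroker(boolean autoAck) {
        this.autoAck = autoAck;
    }

    void publish(String message) {
        ready.addLast(message);
    }

    /** Writes the next ready message to the socket; the broker cannot see whether it arrived. */
    void deliverNext() {
        String message = ready.pollFirst();
        if (message == null) {
            return;
        }
        if (!autoAck) {
            unacked.add(message); // kept until the consumer acknowledges it
        }
        // With auto-ack the message is forgotten as soon as it is written.
    }

    /** Called when the broker finally learns that the connection is gone. */
    void onConnectionLost() {
        // Requeue unacknowledged messages at the head, preserving their order.
        for (int i = unacked.size() - 1; i >= 0; i--) {
            ready.addFirst(unacked.get(i));
        }
        unacked.clear();
    }

    int readyCount() {
        return ready.size();
    }

    int unackedCount() {
        return unacked.size();
    }

    public static void main(String[] args) {
        for (boolean mode : new boolean[] {true, false}) {
            ToyBroker broker = new ToyBroker(mode);
            broker.publish("job-2");
            broker.deliverNext();      // written to a socket whose peer is gone
            broker.onConnectionLost(); // broker notices the dead connection
            System.out.println("autoAck=" + mode + " recoverable=" + broker.readyCount());
        }
    }
}
```

    In this model the auto-ack broker has nothing left to redeliver once the write has happened, while the manual-ack broker puts the message back on the queue. The price of manual ack is possible redelivery after a crash.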
    Gary P. Russell
    Spring Integration Team
    SpringSource, a division of VMware

  9. #9
    Join Date
    Mar 2010
    Location
    Gtr Philadelphia, PA
    Posts
    2,016

    Nope - I see no difference with heartbeats disabled (except the netstat send-Q stays at zero while the cable is unplugged) but, after reconnecting, the next message is received ok.

    I'll see if I can load up 2.8.4 on another VM later; can you try 3.0.0?
    Gary P. Russell
    Spring Integration Team
    SpringSource, a division of VMware

  10. #10
    Join Date
    Mar 2012
    Posts
    11

    I'll try with RabbitMQ 3.0 tomorrow. Is spring-amqp 1.1.2 fully compatible with RabbitMQ 3.0?

    My environment is all Windows. Client: Windows 7; server: Windows Server 2008.

    Regarding your point on the auto-ack setting, we picked that option because we don't want message redelivery if the consumer or the broker crashes.
