  1. JBoss A-MQ
  2. ENTMQ-2056

Unclean broker shutdown causes message loss on CIFS kahadb storage

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Done
    • Affects Version/s: JBoss A-MQ 6.3, JBoss A-MQ 6.3.x
    • Fix Version/s: JBoss A-MQ 6.3.x
    • Component/s: kahadb
    • Labels:
      None
    • Steps to Reproduce:
      1. Deploy a master-slave broker pair with shared KahaDB storage on CIFS.
      2. Send messages to a queue.
      3. Deploy a Camel route that transfers messages from one queue to another.
      4. Kill the broker repeatedly while messages are being transferred.
        • Repeat the steps until the broker refuses to start because of a missing KahaDB journal file.
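      The route used in step 3 is not attached to this issue. A minimal sketch of a transacted queue-to-queue route in Camel's Spring XML DSL might look like the following; the queue names `source` and `target` are hypothetical, and an `activemq` component with a connection factory pointing at the master-slave pair is assumed to be configured elsewhere.

      ```xml
      <!-- Sketch only: queue names are placeholders, not taken from the report. -->
      <camelContext xmlns="http://camel.apache.org/schema/spring">
        <route>
          <!-- transacted=true makes the consumer use a JMS transacted session,
               so a message is removed from the source queue only on commit -->
          <from uri="activemq:queue:source?transacted=true"/>
          <to uri="activemq:queue:target"/>
        </route>
      </camelContext>
      ```

      With a route of this shape, killing the master broker mid-transfer should leave in-flight messages on the source queue for redelivery after failover, which is why lost messages point at the store rather than the route.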

      Description

      I deployed two brokers on Windows machines on OpenStack in a master-slave setup, with KahaDB located on shared CIFS storage. I then sent messages to the master broker and started a transacted Camel route that transferred messages from one queue to another. During the transfer I repeatedly killed the master container. At some point the broker refused to start because of a missing KahaDB journal file.

      09:50:55,169 | INFO  | {AMQ-1-thread-1} [io.fabric8.mq.fabric.ActiveMQServiceFactory$ClusteredConfiguration$1] (ActiveMQServiceFactory.java:502) | 231 - io.fabric8.mq.mq-fabric - 1.2.0.redhat-630187 | Broker amq failed to start.  Will try again in 10 seconds
      09:50:55,169 | ERROR | {AMQ-1-thread-1} [io.fabric8.mq.fabric.ActiveMQServiceFactory$ClusteredConfiguration$1] (ActiveMQServiceFactory.java:503) | 231 - io.fabric8.mq.mq-fabric - 1.2.0.redhat-630187 | Exception on start: java.io.IOException: Detected missing journal files. [6]
      java.io.IOException: Detected missing journal files. [6]
      	at org.apache.activemq.store.kahadb.MessageDatabase.recoverIndex(MessageDatabase.java:935)[219:org.apache.activemq.activemq-osgi:5.11.0.redhat-630187]
      	at org.apache.activemq.store.kahadb.MessageDatabase$5.execute(MessageDatabase.java:676)[219:org.apache.activemq.activemq-osgi:5.11.0.redhat-630187]
      	at org.apache.activemq.store.kahadb.disk.page.Transaction.execute(Transaction.java:779)[219:org.apache.activemq.activemq-osgi:5.11.0.redhat-630187]
      	at org.apache.activemq.store.kahadb.MessageDatabase.recover(MessageDatabase.java:673)[219:org.apache.activemq.activemq-osgi:5.11.0.redhat-630187]
      	at org.apache.activemq.store.kahadb.MessageDatabase.open(MessageDatabase.java:429)[219:org.apache.activemq.activemq-osgi:5.11.0.redhat-630187]
      	at org.apache.activemq.store.kahadb.MessageDatabase.load(MessageDatabase.java:447)[219:org.apache.activemq.activemq-osgi:5.11.0.redhat-630187]
      	at org.apache.activemq.store.kahadb.MessageDatabase.doStart(MessageDatabase.java:283)[219:org.apache.activemq.activemq-osgi:5.11.0.redhat-630187]
      	at org.apache.activemq.store.kahadb.KahaDBStore.doStart(KahaDBStore.java:205)[219:org.apache.activemq.activemq-osgi:5.11.0.redhat-630187]
      	at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:55)[219:org.apache.activemq.activemq-osgi:5.11.0.redhat-630187]
      	at org.apache.activemq.store.kahadb.KahaDBPersistenceAdapter.doStart(KahaDBPersistenceAdapter.java:223)[219:org.apache.activemq.activemq-osgi:5.11.0.redhat-630187]
      	at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:55)[219:org.apache.activemq.activemq-osgi:5.11.0.redhat-630187]
      	at org.apache.activemq.broker.BrokerService.doStartPersistenceAdapter(BrokerService.java:658)[219:org.apache.activemq.activemq-osgi:5.11.0.redhat-630187]
      	at org.apache.activemq.broker.BrokerService.startPersistenceAdapter(BrokerService.java:642)[219:org.apache.activemq.activemq-osgi:5.11.0.redhat-630187]
      	at org.apache.activemq.broker.BrokerService.start(BrokerService.java:607)[219:org.apache.activemq.activemq-osgi:5.11.0.redhat-630187]
      	at io.fabric8.mq.fabric.ActiveMQServiceFactory$ClusteredConfiguration.doStart(ActiveMQServiceFactory.java:549)[231:io.fabric8.mq.mq-fabric:1.2.0.redhat-630187]
      	at io.fabric8.mq.fabric.ActiveMQServiceFactory$ClusteredConfiguration.access$400(ActiveMQServiceFactory.java:359)[231:io.fabric8.mq.mq-fabric:1.2.0.redhat-630187]
      	at io.fabric8.mq.fabric.ActiveMQServiceFactory$ClusteredConfiguration$1.run(ActiveMQServiceFactory.java:490)[231:io.fabric8.mq.mq-fabric:1.2.0.redhat-630187]
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)[:1.8.0_111]
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)[:1.8.0_111]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)[:1.8.0_111]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)[:1.8.0_111]
      	at java.lang.Thread.run(Thread.java:745)[:1.8.0_111]
      

      I also tried ignoring the missing journal file by setting the KahaDB option ignoreMissingJournalFile=true. This workaround lets the broker start, but it only helps when the system holds a small number of messages (up to 500 000). With 1 000 000 or more messages, the broker still starts, but a large number of messages are lost.
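
      For reference, the workaround is typically applied on the `kahaDB` element of the broker configuration. The fragment below is a sketch, not the configuration used in this test: the `directory` value is a placeholder for the shared CIFS location, and the attribute is spelled `ignoreMissingJournalfiles`, as in the upstream ActiveMQ KahaDB documentation, which differs slightly from the spelling used in the description above.

      ```xml
      <!-- Sketch only: directory is a placeholder for the shared CIFS path. -->
      <persistenceAdapter>
        <!-- ignoreMissingJournalfiles=true lets recovery continue when a
             journal data file referenced by the index is absent; messages
             stored in the missing file cannot be recovered -->
        <kahaDB directory="${data}/kahadb"
                ignoreMissingJournalfiles="true"/>
      </persistenceAdapter>
      ```

      Because the option skips missing journal files instead of restoring them, the message loss reported above is consistent with it masking the problem rather than fixing it.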


          Attachments

          1. 10.8.181.87-client.log
            564 kB
            Jakub Knetl
          2. 10.8.181.88-broker1.log
            3.11 MB
            Jakub Knetl
          3. 10.8.181.90-broker2.log
            3.08 MB
            Jakub Knetl
          4. kahadb.tar.gz.01
            15.00 MB
            Jakub Knetl
          5. kahadb.tar.gz.02
            15.00 MB
            Jakub Knetl
          6. kahadb.tar.gz.03
            14.68 MB
            Jakub Knetl


                People

                • Assignee:
                  garytully Gary Tully
                  Reporter:
                  jknetl Jakub Knetl
                • Votes:
                  0
                  Watchers:
                  2

                  Dates

                  • Created:
                    Updated:
                    Resolved: