Uploaded image for project: 'JBoss A-MQ'
  1. JBoss A-MQ
  2. ENTMQ-2067

Broker Does Not Recover Cleanly from JDBC Datasource Outage

XMLWordPrintable

    • Hide

      Working on reproducer. Reproduced intermittently by connecting a master-slave pair to a local Oracle 11 instance and starting / stopping the instance. Appears to occur on the master broker. Configuration, log, and thread dumps attached.

      Show
      Working on reproducer. Reproduced intermittently by connecting a master-slave pair to a local Oracle 11 instance and starting / stopping the instance. Appears to occur on the master broker. Configuration, log, and thread dumps attached.

      When a broker using JDBC persistence with the lease locker recovers from a database outage, along with default values for the jDBCIOExceptionHandler (or leaseLockerIOExceptionHandler):

      stopStartConnectors=true
      

      The broker intermittently comes back in an inconsistent state. Sometimes, it appears that the JDBC persistence configuration is no longer in memory, resulting in defaulting to kahadb persistence, with resulting exceptions when the broker tries to recover messages:

      2017-02-17 11:28:04,912 | INFO  | start transports | DefaultIOExceptionHandler        | il.DefaultIOExceptionHandler$1$1  109 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | waiting for broker persistence adapter checkpoint to succeed before restarting transports
      2017-02-17 11:28:05,998 | INFO  | A Scheduled Task | LeaseDatabaseLocker              | q.store.jdbc.LeaseDatabaseLocker  117 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | node11 Lease held by node12 till Fri Feb 17 11:28:14 EST 2017
      2017-02-17 11:28:05,999 | ERROR | A Scheduled Task | LockableServiceSupport           | mq.broker.LockableServiceSupport  143 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | node11, no longer able to keep the exclusive lock so giving up being a master
      2017-02-17 11:28:06,015 | INFO  | A Scheduled Task | BrokerService                    | he.activemq.broker.BrokerService  770 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | Apache ActiveMQ 5.11.0.redhat-621084 (node11, amq) is shutting down
      2017-02-17 11:28:06,016 | INFO  | A Scheduled Task | TransportConnector               | tivemq.broker.TransportConnector  291 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | Connector openwire stopped
      2017-02-17 11:28:06,017 | INFO  | A Scheduled Task | TransportConnector               | tivemq.broker.TransportConnector  291 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | Connector amqp stopped
      2017-02-17 11:28:06,017 | INFO  | A Scheduled Task | TransportConnector               | tivemq.broker.TransportConnector  291 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | Connector mqtt stopped
      2017-02-17 11:28:06,018 | INFO  | A Scheduled Task | TransportConnector               | tivemq.broker.TransportConnector  291 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | Connector ws stopped
      2017-02-17 11:28:06,021 | INFO  | A Scheduled Task | PListStoreImpl                   | tore.kahadb.plist.PListStoreImpl  356 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | PListStore:[/opt/amq/containers/jboss-a-mq-6.2.1.redhat-084/data/amq/node11/tmp_storage] stopped
      2017-02-17 11:28:06,025 | INFO  | A Scheduled Task | BrokerService                    | he.activemq.broker.BrokerService  831 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | Apache ActiveMQ 5.11.0.redhat-621084 (node11, amq) uptime 39 minutes
      2017-02-17 11:28:06,025 | INFO  | A Scheduled Task | BrokerService                    | he.activemq.broker.BrokerService  833 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | Apache ActiveMQ 5.11.0.redhat-621084 (node11, amq) is shutdown
      2017-02-17 11:28:06,025 | INFO  | A Scheduled Task | ActiveMQServiceFactory           | mq.fabric.ActiveMQServiceFactory  102 | 151 - io.fabric8.mq.mq-fabric - 1.2.0.redhat-621084 | Broker 'amq' shut down, giving up being master
      2017-02-17 11:28:09,916 | WARN  | start transports | DefaultIOExceptionHandler        | il.DefaultIOExceptionHandler$1$1  127 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | Stopping BrokerService[node11] due to failure restarting transports
      java.lang.NullPointerException
              at org.apache.activemq.store.kahadb.KahaDBStore.getDestinations(KahaDBStore.java:1037)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.store.kahadb.KahaDBPersistenceAdapter.getDestinations(KahaDBPersistenceAdapter.java:144)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.broker.region.DestinationFactoryImpl.getDestinations(DestinationFactoryImpl.java:68)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.broker.region.RegionBroker.getDurableDestinations(RegionBroker.java:680)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.broker.BrokerFilter.getDurableDestinations(BrokerFilter.java:257)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.broker.BrokerFilter.getDurableDestinations(BrokerFilter.java:257)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.broker.BrokerFilter.getDurableDestinations(BrokerFilter.java:257)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.broker.BrokerFilter.getDurableDestinations(BrokerFilter.java:257)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.broker.BrokerFilter.getDurableDestinations(BrokerFilter.java:257)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.broker.MutableBrokerFilter.getDurableDestinations(MutableBrokerFilter.java:262)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.broker.BrokerService.startAllConnectors(BrokerService.java:2484)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.util.DefaultIOExceptionHandler$1$1.run(DefaultIOExceptionHandler.java:123)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
      2017-02-17 11:28:09,916 | INFO  | start transports | DefaultIOExceptionHandler        | q.util.DefaultIOExceptionHandler  165 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | Stopping BrokerService[node11] due to exception, java.lang.NullPointerException
      java.lang.NullPointerException
              at org.apache.activemq.store.kahadb.KahaDBStore.getDestinations(KahaDBStore.java:1037)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.store.kahadb.KahaDBPersistenceAdapter.getDestinations(KahaDBPersistenceAdapter.java:144)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.broker.region.DestinationFactoryImpl.getDestinations(DestinationFactoryImpl.java:68)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.broker.region.RegionBroker.getDurableDestinations(RegionBroker.java:680)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.broker.BrokerFilter.getDurableDestinations(BrokerFilter.java:257)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.broker.BrokerFilter.getDurableDestinations(BrokerFilter.java:257)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.broker.BrokerFilter.getDurableDestinations(BrokerFilter.java:257)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.broker.BrokerFilter.getDurableDestinations(BrokerFilter.java:257)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.broker.BrokerFilter.getDurableDestinations(BrokerFilter.java:257)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.broker.MutableBrokerFilter.getDurableDestinations(MutableBrokerFilter.java:262)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.broker.BrokerService.startAllConnectors(BrokerService.java:2484)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
              at org.apache.activemq.util.DefaultIOExceptionHandler$1$1.run(DefaultIOExceptionHandler.java:123)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
      

      Other times, the broker exists, giving up becoming master:

      2017-02-23 13:43:33,521 | WARN  | start transports | LeaseLockerIOExceptionHandler    | il.LeaseLockerIOExceptionHandler   58 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | Lock keepAlive failed, no longer lock owner with: LeaseDatabaseLocker owner:node2,duration:10000,renew:10000
      2017-02-23 13:43:33,522 | WARN  | start transports | DefaultIOExceptionHandler        | il.DefaultIOExceptionHandler$1$1  127 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | Stopping BrokerService[node2] due to failure restarting transports
      java.io.IOException: Lock keepAlive failed, no longer lock owner with: LeaseDatabaseLocker owner:node2,duration:10000,renew:10000
      	at org.apache.activemq.util.LeaseLockerIOExceptionHandler.hasLockOwnership(LeaseLockerIOExceptionHandler.java:59)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
      	at org.apache.activemq.util.DefaultIOExceptionHandler$1$1.run(DefaultIOExceptionHandler.java:108)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
      2017-02-23 13:43:33,523 | INFO  | start transports | DefaultIOExceptionHandler        | q.util.DefaultIOExceptionHandler  165 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | Stopping BrokerService[node2] due to exception, java.io.IOException: Lock keepAlive failed, no longer lock owner with: LeaseDatabaseLocker owner:node2,duration:10000,renew:10000
      java.io.IOException: Lock keepAlive failed, no longer lock owner with: LeaseDatabaseLocker owner:node2,duration:10000,renew:10000
      	at org.apache.activemq.util.LeaseLockerIOExceptionHandler.hasLockOwnership(LeaseLockerIOExceptionHandler.java:59)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
      	at org.apache.activemq.util.DefaultIOExceptionHandler$1$1.run(DefaultIOExceptionHandler.java:108)[137:org.apache.activemq.activemq-osgi:5.11.0.redhat-621084]
      2017-02-23 13:43:33,622 | INFO  | erService[node2] | BrokerService                    | he.activemq.broker.BrokerService  770 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | Apache ActiveMQ 5.11.0.redhat-621084 (node2, amq) is shutting down
      2017-02-23 13:43:33,622 | INFO  | erService[node2] | TransportConnector               | tivemq.broker.TransportConnector  291 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | Connector openwire stopped
      2017-02-23 13:43:33,623 | INFO  | erService[node2] | TransportConnector               | tivemq.broker.TransportConnector  291 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | Connector amqp stopped
      2017-02-23 13:43:33,623 | INFO  | erService[node2] | TransportConnector               | tivemq.broker.TransportConnector  291 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | Connector mqtt stopped
      2017-02-23 13:43:33,623 | INFO  | erService[node2] | TransportConnector               | tivemq.broker.TransportConnector  291 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | Connector ws stopped
      2017-02-23 13:43:34,022 | INFO  | erService[node2] | JobSchedulerStoreImpl            | .scheduler.JobSchedulerStoreImpl  259 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | JobSchedulerStore: /opt/amq/containers/jboss-a-mq-6.2.1.redhat-084/data/amq/node2/scheduler stopped.
      2017-02-23 13:43:34,109 | INFO  | erService[node2] | PListStoreImpl                   | tore.kahadb.plist.PListStoreImpl  356 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | PListStore:[/opt/amq/containers/jboss-a-mq-6.2.1.redhat-084/data/amq/node2/tmp_storage] stopped
      2017-02-23 13:43:34,270 | INFO  | erService[node2] | BrokerService                    | he.activemq.broker.BrokerService  831 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | Apache ActiveMQ 5.11.0.redhat-621084 (node2, amq) uptime 18 minutes
      2017-02-23 13:43:34,272 | INFO  | erService[node2] | BrokerService                    | he.activemq.broker.BrokerService  833 | 137 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621084 | Apache ActiveMQ 5.11.0.redhat-621084 (node2, amq) is shutdown
      2017-02-23 13:43:34,273 | INFO  | erService[node2] | ActiveMQServiceFactory           | mq.fabric.ActiveMQServiceFactory  102 | 151 - io.fabric8.mq.mq-fabric - 1.2.0.redhat-621084 | Broker 'amq' shut down, giving up being master
      

      After either of these scenarios, the broker may be in an unrecoverable state - it appears to give up trying to get the lock lease or restart.

        1. activemq.xml
          6 kB
        2. amq.log
          647 kB
        3. dumps.tar
          240 kB

              gtully@redhat.com Gary Tully
              rhn-support-dhawkins Duane Hawkins
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

                Created:
                Updated:
                Resolved: