  AMQ Broker / ENTMQBR-8122

[7.8.EAP] Unhandled NullPointerException in JournalTransaction::forget

Details

    Description

      When the broker restarts after a file system error, an unhandled NullPointerException is thrown in org.apache.activemq.artemis.core.journal.impl.JournalTransaction::forget while the journal is being loaded, and the boot fails:

      05/24/23 14:30:41,569 ERROR [] [] [org.jboss.msc.service.fail] (ServerService Thread Pool -- 76) MSC000001: Failed to start service jboss.messaging-activemq.active.jms.manager: org.jboss.msc.service.StartException in service jboss.messaging-activemq.active.jms.manager: java.lang.Exception
      	at org.wildfly.extension.messaging-activemq//org.wildfly.extension.messaging.activemq.jms.JMSService.lambda$doStart$0(JMSService.java:147)
      	at org.apache.activemq.artemis@2.16.0.redhat-00034//org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.callActivationFailureListeners(ActiveMQServerImpl.java:2434)
      	at org.apache.activemq.artemis@2.16.0.redhat-00034//org.apache.activemq.artemis.core.server.impl.SharedStoreLiveActivation.onActivationFailure(SharedStoreLiveActivation.java:118)
      	at org.apache.activemq.artemis@2.16.0.redhat-00034//org.apache.activemq.artemis.core.server.impl.SharedStoreLiveActivation.run(SharedStoreLiveActivation.java:111)
      	at org.apache.activemq.artemis@2.16.0.redhat-00034//org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.internalStart(ActiveMQServerImpl.java:639)
      	at org.apache.activemq.artemis@2.16.0.redhat-00034//org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.start(ActiveMQServerImpl.java:558)
      	at org.apache.activemq.artemis@2.16.0.redhat-00034//org.apache.activemq.artemis.jms.server.impl.JMSServerManagerImpl.start(JMSServerManagerImpl.java:374)
      	at org.wildfly.extension.messaging-activemq//org.wildfly.extension.messaging.activemq.jms.JMSService.doStart(JMSService.java:211)
      	at org.wildfly.extension.messaging-activemq//org.wildfly.extension.messaging.activemq.jms.JMSService.access$000(JMSService.java:65)
      	at org.wildfly.extension.messaging-activemq//org.wildfly.extension.messaging.activemq.jms.JMSService$1.run(JMSService.java:100)
      	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
      	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
      	at org.jboss.threads@2.4.0.Final-redhat-00001//org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
      	at org.jboss.threads@2.4.0.Final-redhat-00001//org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1990)
      	at org.jboss.threads@2.4.0.Final-redhat-00001//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
      	at org.jboss.threads@2.4.0.Final-redhat-00001//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1348)
      	at java.base/java.lang.Thread.run(Thread.java:829)
      	at org.jboss.threads@2.4.0.Final-redhat-00001//org.jboss.threads.JBossThread.run(JBossThread.java:513)
      Caused by: java.lang.Exception
      	at org.apache.activemq.artemis.journal//org.apache.activemq.artemis.core.journal.impl.JournalImpl.readJournalFile(JournalImpl.java:866)
      	at org.apache.activemq.artemis.journal//org.apache.activemq.artemis.core.journal.impl.JournalImpl.load(JournalImpl.java:2028)
      	at org.apache.activemq.artemis.journal//org.apache.activemq.artemis.core.journal.impl.JournalImpl.load(JournalImpl.java:2319)
      	at org.apache.activemq.artemis.journal//org.apache.activemq.artemis.core.journal.impl.JournalImpl.load(JournalImpl.java:1629)
      	at org.apache.activemq.artemis.journal//org.apache.activemq.artemis.core.journal.Journal.load(Journal.java:259)
      	at org.apache.activemq.artemis@2.16.0.redhat-00034//org.apache.activemq.artemis.core.persistence.impl.journal.AbstractJournalStorageManager.loadMessageJournal(AbstractJournalStorageManager.java:874)
      	at org.apache.activemq.artemis@2.16.0.redhat-00034//org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.loadJournals(ActiveMQServerImpl.java:3518)
      	at org.apache.activemq.artemis@2.16.0.redhat-00034//org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.initialisePart2(ActiveMQServerImpl.java:3183)
      	at org.apache.activemq.artemis@2.16.0.redhat-00034//org.apache.activemq.artemis.core.server.impl.SharedStoreLiveActivation.run(SharedStoreLiveActivation.java:94)
      	... 14 more
      Caused by: java.lang.NullPointerException
      	at org.apache.activemq.artemis.journal//org.apache.activemq.artemis.core.journal.impl.JournalTransaction.forget(JournalTransaction.java:358)
      	at org.apache.activemq.artemis.journal//org.apache.activemq.artemis.core.journal.impl.JournalImpl$12.onReadCommitRecord(JournalImpl.java:2214)
      	at org.apache.activemq.artemis.journal//org.apache.activemq.artemis.core.journal.impl.JournalImpl.readJournalFile(JournalImpl.java:835)
      	... 22 more
      

      This occurred in a broker cluster running on replicated NFS storage. The NFS file system was failed over, and the brokers reported Critical IO Errors caused by stale file handles, presumably because of replication lag.

      On restart, several of the brokers failed to start and logged the errors above. In this instance, the brokers were embedded in JBoss EAP 7.4.4 (broker version 2.16.0.redhat-00034).
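
As an illustration only, the Java sketch below models one way a commit record read during journal replay could hit a null reference: the transaction's state is allocated lazily and was never populated, for example because earlier records were lost. It also shows how a null guard would avoid the failure. All names here (`ReplaySketch`, `ReplayTx`, `pendingFiles`, `forgetUnsafe`, `forgetGuarded`) are hypothetical and are not taken from the Artemis source. The actual cause at JournalTransaction.java:358 is not confirmed by this report.

```java
import java.util.HashSet;
import java.util.Set;

final class ReplaySketch {

    /** Hypothetical stand-in for per-transaction state rebuilt during journal replay. */
    static final class ReplayTx {
        // Assumption: allocated lazily, so it is still null if no record was ever attached.
        private Set<String> pendingFiles;
        private int released;

        void addRecord(String fileName) {
            if (pendingFiles == null) {
                pendingFiles = new HashSet<>();
            }
            pendingFiles.add(fileName);
        }

        // Unguarded: dereferences pendingFiles even when it was never allocated.
        void forgetUnsafe() {
            released += pendingFiles.size();
        }

        // Guarded: a transaction with no attached records has nothing to release.
        void forgetGuarded() {
            if (pendingFiles == null) {
                return;
            }
            released += pendingFiles.size();
        }

        int released() {
            return released;
        }
    }

    public static void main(String[] args) {
        // Models a commit record seen for a transaction with no prior records.
        ReplayTx empty = new ReplayTx();
        try {
            empty.forgetUnsafe();
        } catch (NullPointerException e) {
            System.out.println("unguarded forget failed: " + e.getClass().getName());
        }
        empty.forgetGuarded();
        System.out.println("guarded forget released " + empty.released() + " file(s)");
    }
}
```

Whether silently skipping such a transaction is acceptable, or whether the broker should instead log and report the inconsistent journal, is a design decision this sketch does not settle.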

            People

              csuconic@redhat.com Clebert Suconic
              rhn-support-dhawkins Duane Hawkins
              Roman Vais Roman Vais
