  1. AMQ Broker
  2. ENTMQBR-9441

AMQ 7.12: NullPointerException and other inexplicable error messages in a four-broker mesh


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Blocker
    • None
    • AMQ 7.12.1.GA
    • broker-core, clustering
    • None
    • None
    • False
    • Critical

       
      Fully connected mesh of four AMQ 7.12 brokers. The configuration is rather complicated, with many diverts and security settings. In circumstances that are, at present, unclear, one or more brokers start to misbehave. The misbehaviour can become apparent within a second or so of starting a broker, before it has even fully started. However, it seems that the characteristic sequence of error messages can appear on any broker, and usually multiple brokers are affected at the same time.

      Here are some of the messages that appear. Once they start, they recur frequently, and the affected broker appears to be disconnected from the rest of the cluster.

      2024-09-23 09:38:51,718 INFO  [org.apache.activemq.artemis.core.server] AMQ221027: Bridge ClusterConnectionBridge@739bcdef .... discoveryGroupConfiguration=null]] is connected
      2024-09-23 09:38:51,720 WARN  [org.apache.activemq.artemis.core.server] AMQ222110: no queue IDs defined!,  originalMessage  = CoreMessage[messageID=2214253615, durable=true, userID=c744376a-74af-11ef-a1d3-0050569e8339, priority=4, timestamp=Tue Sep 17 06:46:18 CEST 2024,...1, JMSCorrelationID=INGBNL2A_TX20240917 1716171046783193]]@730958237, props=_AMQ_ROUTE_TO$.artemis.internal.sf.my-cluster.3e66ebf6-5bc2-11ef-9ac6-0050569eb015
      
      2024-09-23 09:38:51,819 WARN  [org.apache.activemq.artemis.core.server] AMQ222095: Connection failed with failedOver=false
      2024-09-23 09:38:51,929 ERROR [org.apache.activemq.artemis.core.server] AMQ224030: Could not cancel reference Reference[2214253615]:RELIABLE:CoreMessage[messageID=2214253615, durable=true, userID=c744376a-74af-11ef-a1d3-0050569e8339, priority=4, timestamp=Tue Sep 17 06:46:18 CEST 2024, expiration=0, durable=true, address=instantpayments_abnamro_beneficiary_payment_confirmation, size=2557, properties=TypedProperties[__AMQ_CID=ipint1fl101, _AMQ_ROUTING_TYPE=1, JMSCorrelationID=INGBNL2A_TX20240917 1716171046783193]]@1269696972
      java.lang.NullPointerException: Cannot invoke "org.apache.activemq.artemis.core.server.MessageReference.getSequence()" because "o1" is null
              at org.apache.activemq.artemis.core.server.impl.MessageReferenceImpl$MessageReferenceComparatorSequence.compare(MessageReferenceImpl.java:50) ~[artemis-server-2.33.0.redhat-00010.jar:2.33.0.redhat-00010]
      
      2024-09-23 09:38:51,819 WARN  [org.apache.activemq.artemis.core.server] AMQ222095: Connection failed with failedOver=false
      2024-09-23 09:38:55,631 ERROR [org.apache.activemq.artemis.core.server] AMQ224030: Could not cancel reference Reference[2214253615]:RELIABLE:CoreMessage[messageID=2214253615, durable=true, userID=c744376a-74af-11ef-a1d3-0050569e8339, priority=4, timestamp=Tue Sep 17 06:46:18 CEST 2024, expiration=0, durable=true, address=instantpayments_abnamro_beneficiary_payment_confirmation, size=2557, properties=TypedProperties[__AMQ_CID=ipint1fl101, _AMQ_ROUTING_TYPE=1, JMSCorrelationID=INGBNL2A_TX20240917 1716171046783193]]@1269696972
      java.lang.NullPointerException: null

      Sometimes we also see an OutOfMemoryError mixed in with these error messages, but not always.

      The problem occurs only infrequently, perhaps every week or two. When it does, there appears to be no solution but to delete the entire message store on each broker, and restart the cluster. Restarting alone does not seem to fix the problem.
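
      For triage, it may help to check each broker's log for the sequence of message codes quoted above (AMQ222110, AMQ222095, AMQ224030) together with the NullPointerException lines. The following is only a minimal sketch, not part of the broker tooling: the function names and the rule "all three codes present" are assumptions made for illustration, and the log path passed to scan_file is whatever the installation uses.

```python
import re
from collections import Counter

# Message codes quoted in this report; treating "all three present" as the
# signature is an assumption for illustration, not an official diagnostic.
SIGNATURE_CODES = ("AMQ222110", "AMQ222095", "AMQ224030")

# Matches e.g. "WARN  [org.apache...server] AMQ222095:" anywhere in a line,
# so a truncated timestamp at the start of the line does not matter.
CODE_RE = re.compile(r"\b(?:WARN|ERROR)\s+\[[^\]]+\]\s+(AMQ\d{6}):")


def count_signature(lines):
    """Count signature codes and NullPointerException lines in log lines."""
    counts = Counter()
    npe_lines = 0
    for line in lines:
        match = CODE_RE.search(line)
        if match and match.group(1) in SIGNATURE_CODES:
            counts[match.group(1)] += 1
        elif line.lstrip().startswith("java.lang.NullPointerException"):
            npe_lines += 1
    return counts, npe_lines


def looks_affected(counts):
    """True if every signature code was seen at least once."""
    return all(counts[code] > 0 for code in SIGNATURE_CODES)


def scan_file(path):
    """Scan one log file; the path is supplied by the caller."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_signature(handle)
```

      Calling scan_file on each broker's log and passing the returned counts to looks_affected gives a quick indication of which brokers show the pattern; the counts themselves indicate how often the messages recur.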

       

              Assignee: Unassigned
              Reporter: Kevin Boone (rhn-support-kboone)
              Votes: 1
              Watchers: 4
