Details

    • Type: Bug
    • Status: Verified
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: 7.1.0.CR3
    • Fix Version/s: 7.2.0.GA.CR1
    • Component/s: ActiveMQ
    • Labels: None

      Description

      In tests based on MultiServerTestBase, the waitForTopology check sometimes fails after all servers have started, with the following error:

      Timed out waiting for cluster topology of live=5,backup=5 (received live=4, backup=5) topology = topology on Topology@5884a914[owner=ClusterConnectionImpl@405215542[nodeUUID=bbbae377-ba40-11e7-aff3-fa163e312a80, connector=TransportConfiguration(name=bbbabc66-ba40-11e7-aff3-fa163e312a80, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=0, address=cluster-queues, server=ActiveMQServerImpl::BridgeFailoverTest/Live(0)]]:
       bbd79349-ba40-11e7-aff3-fa163e312a80 => TopologyMember[id = bbd79349-ba40-11e7-aff3-fa163e312a80, connector=Pair[a=TransportConfiguration(name=bbd79348-ba40-11e7-aff3-fa163e312a80, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=1, b=TransportConfiguration(name=bbd79353-ba40-11e7-aff3-fa163e312a80, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=6], backupGroupName=null, scaleDownGroupName=null]
       bbd7935b-ba40-11e7-aff3-fa163e312a80 => TopologyMember[id = bbd7935b-ba40-11e7-aff3-fa163e312a80, connector=Pair[a=TransportConfiguration(name=bbd7935a-ba40-11e7-aff3-fa163e312a80, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=2, b=TransportConfiguration(name=bbd7ba75-ba40-11e7-aff3-fa163e312a80, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=7], backupGroupName=null, scaleDownGroupName=null]
       bbbae377-ba40-11e7-aff3-fa163e312a80 => TopologyMember[id = bbbae377-ba40-11e7-aff3-fa163e312a80, connector=Pair[a=TransportConfiguration(name=bbbabc66-ba40-11e7-aff3-fa163e312a80, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=0, b=TransportConfiguration(name=bbd76c31-ba40-11e7-aff3-fa163e312a80, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=5], backupGroupName=null, scaleDownGroupName=null]
       bbd7ba8f-ba40-11e7-aff3-fa163e312a80 => TopologyMember[id = bbd7ba8f-ba40-11e7-aff3-fa163e312a80, connector=Pair[a=null, b=TransportConfiguration(name=bbd7ba99-ba40-11e7-aff3-fa163e312a80, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=9], backupGroupName=null, scaleDownGroupName=null]
       bbd7ba7d-ba40-11e7-aff3-fa163e312a80 => TopologyMember[id = bbd7ba7d-ba40-11e7-aff3-fa163e312a80, connector=Pair[a=TransportConfiguration(name=bbd7ba7c-ba40-11e7-aff3-fa163e312a80, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=3, b=TransportConfiguration(name=bbd7ba87-ba40-11e7-aff3-fa163e312a80, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=8], backupGroupName=null, scaleDownGroupName=null]
       nodes=9 members=5)
      

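      I dug into this and found that in certain cases the Live's topology update message has an older event ID than the Backup's update message and is also received later. In these cases the Live's message is ignored, because it does not satisfy the uniqueEventID condition shown in the code snippet below. This is consistent with the dump above, where member bbd7ba8f-ba40-11e7-aff3-fa163e312a80 has connector=Pair[a=null, b=...serverId=9], i.e. only the Backup connector was recorded.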

      I think that if the current node has no connector to the Live, it should not ignore a topology update from the Live, even if that update is older than the current record.

      public boolean updateMember(final long uniqueEventID, final String nodeId, final TopologyMemberImpl memberInput) {
      
         // the update is applied only when the incoming event ID is newer than the stored one
         if (uniqueEventID > currentMember.getUniqueEventID()) {
                  ...
         }
         /*
          * always add the backup, better to try to reconnect to something that's not there than to
          * not know about it at all
          */
         if (currentMember.getBackup() == null && memberInput.getBackup() != null) {
            currentMember.setBackup(memberInput.getBackup());
         }
      }
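
The proposal above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the actual fix: TopologySketch, Member, and the connector strings are hypothetical stand-ins for the real topology classes, and the handling of newer events (keeping connectors that the newer message does not carry) is an assumption made so that the example is self-contained. The point it shows is that an update with an older event ID can still fill in a missing Live connector, mirroring the existing rule for the Backup.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of a topology table. These classes are
// illustrative stand-ins, not the real Artemis Topology/TopologyMemberImpl API.
final class TopologySketch {

    static final class Member {
        final long uniqueEventID;
        final String live;   // connector to the live server, or null if unknown
        final String backup; // connector to the backup server, or null if unknown

        Member(long uniqueEventID, String live, String backup) {
            this.uniqueEventID = uniqueEventID;
            this.live = live;
            this.backup = backup;
        }
    }

    private final Map<String, Member> members = new HashMap<>();

    synchronized Member get(String nodeId) {
        return members.get(nodeId);
    }

    // Returns true if the stored record for nodeId changed.
    synchronized boolean updateMember(long uniqueEventID, String nodeId, Member input) {
        Member current = members.get(nodeId);
        if (current == null) {
            members.put(nodeId, new Member(uniqueEventID, input.live, input.backup));
            return true;
        }
        if (uniqueEventID > current.uniqueEventID) {
            // Newer event: take its connectors, keeping any it does not carry
            // (an assumption of this sketch).
            String live = input.live != null ? input.live : current.live;
            String backup = input.backup != null ? input.backup : current.backup;
            members.put(nodeId, new Member(uniqueEventID, live, backup));
            return true;
        }
        // Older or equal event: only fill in connectors that are still missing.
        // Filling the live connector is the change proposed in this report;
        // the backup case mirrors the existing "always add the backup" rule.
        boolean addLive = current.live == null && input.live != null;
        boolean addBackup = current.backup == null && input.backup != null;
        if (!addLive && !addBackup) {
            return false;
        }
        members.put(nodeId, new Member(current.uniqueEventID,
                addLive ? input.live : current.live,
                addBackup ? input.backup : current.backup));
        return true;
    }
}
```

With this sketch, if the Backup's update (event ID 2) arrives first and the Live's update (event ID 1) arrives afterwards, the second call still records the Live connector while the stored event ID stays at 2. A later stale update cannot overwrite a Live connector that is already known.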
      

                People

                • Assignee: Dmitrii Tikhomirov (treblereel)
                • Reporter: Erich Duda (eduda)