  1. WildFly
  2. WFLY-11682

Clustered SLSB membership anomalies when all cluster members removed


Details

      1. Install a plain WildFly server
      2. Build the playground.zip project attached to this Jira using Maven
      3. Copy the built playground.jar to $JBOSS_HOME/standalone/deployments
      4. Copy the standalone folder three times, as node1, node2 and node3
      5. Start the servers:
        $JBOSS_HOME/bin/standalone.sh -c standalone-ha.xml -Djboss.node.name=node1 -Djboss.server.base.dir=$JBOSS_HOME/node1
        $JBOSS_HOME/bin/standalone.sh -c standalone-ha.xml -Djboss.node.name=node2 -Djboss.server.base.dir=$JBOSS_HOME/node2 -Djboss.socket.binding.port-offset=300
        $JBOSS_HOME/bin/standalone.sh -c standalone-ha.xml -Djboss.node.name=node3 -Djboss.server.base.dir=$JBOSS_HOME/node3 -Djboss.socket.binding.port-offset=600
      6. Start the client (runtime 300 = 5 minutes):
        mvn -f playground-jar/pom.xml exec:exec -Druntime=300
      7. Stop node1
      8. Stop node2
      9. Stop node3
      10. Start node1 again (same command as in step 5)
      11. Observe that node3 is still in the list of available nodes:
        INFO (ThreadPoolTaskExecutor-5) [com.jboss.examples.ejb.CustomClusterNodeSelector] connectedNodes(1) '[node1]', availableNodes(2) '[node3, node1]'
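
      To check many such log lines, the comparison can be automated. The following is a minimal sketch; StaleNodeCheck is a hypothetical helper and is not part of the attached playground project. It assumes the selector's log format shown above. A node that is available but not connected is only a candidate for a stale entry, since during normal operation a node may be known but not yet connected.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not in playground.zip): finds nodes that the selector
// logged as available but not as connected.
final class StaleNodeCheck {

    // Matches: connectedNodes(1) '[node1]', availableNodes(2) '[node3, node1]'
    private static final Pattern LINE = Pattern.compile(
            "connectedNodes\\(\\d+\\) '\\[(.*?)\\]', availableNodes\\(\\d+\\) '\\[(.*?)\\]'");

    // Returns the available-but-not-connected nodes in log order,
    // or an empty list if the line does not have the expected format.
    static List<String> availableButNotConnected(String logLine) {
        Matcher m = LINE.matcher(logLine);
        if (!m.find()) {
            return List.of();
        }
        Set<String> connected = split(m.group(1));
        Set<String> available = split(m.group(2));
        available.removeAll(connected);
        return new ArrayList<>(available);
    }

    // Splits "node3, node1" into an ordered set of trimmed node names.
    private static Set<String> split(String csv) {
        Set<String> out = new LinkedHashSet<>();
        for (String part : csv.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                out.add(name);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "INFO (ThreadPoolTaskExecutor-5) "
                + "[com.jboss.examples.ejb.CustomClusterNodeSelector] "
                + "connectedNodes(1) '[node1]', availableNodes(2) '[node3, node1]'";
        System.out.println(availableButNotConnected(line));
    }
}
```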

    Description

      This description is based on a 3-node cluster. Nodes 1 and 2 are configured in the PROVIDER_URL; node 3 is not.
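
      For illustration, such a PROVIDER_URL could be assembled as in the sketch below. This is not taken from the attached project: the host localhost, the remote+http scheme, the default HTTP port 8080 and the use of WildFlyInitialContextFactory are assumptions; only the port offsets (0 for node1, 300 for node2) come from the start commands in this report. The class name ProviderUrlSketch is hypothetical.

```java
import java.util.Properties;
import javax.naming.Context;

// Hypothetical sketch: JNDI properties listing node1 and node2 but not node3.
final class ProviderUrlSketch {

    // Assumption: the unmodified default HTTP port of standalone-ha.xml.
    static final int DEFAULT_HTTP_PORT = 8080;

    // Builds one URL from a host and the -Djboss.socket.binding.port-offset value.
    static String url(String host, int portOffset) {
        return "remote+http://" + host + ":" + (DEFAULT_HTTP_PORT + portOffset);
    }

    static Properties jndiProperties() {
        Properties p = new Properties();
        // Assumption: the client uses the WildFly naming client.
        p.put(Context.INITIAL_CONTEXT_FACTORY,
                "org.wildfly.naming.client.WildFlyInitialContextFactory");
        // node1 (offset 0) and node2 (offset 300); node3 (offset 600) is left out,
        // so the client can learn about it only through cluster topology updates.
        p.put(Context.PROVIDER_URL, url("localhost", 0) + "," + url("localhost", 300));
        return p;
    }

    public static void main(String[] args) {
        System.out.println(jndiProperties().getProperty(Context.PROVIDER_URL));
    }
}
```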

      The client has a custom ClusterNodeSelector implementation that prints the connectedNodes and the availableNodes and balances invocations randomly.
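
      The selector in playground.zip is not reproduced here. As a rough, standalone sketch of the idea: the class below has a selectNode method with the parameter shape that, as far as I know, org.jboss.ejb.client.ClusterNodeSelector declares (cluster name, connected nodes, all available nodes). It deliberately does not implement the real interface so that it compiles without the EJB client library. Preferring connected nodes over merely available ones is a choice made for this sketch and may differ from the attached code.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical stand-in for the custom selector; a real one would
// implement org.jboss.ejb.client.ClusterNodeSelector (assumption).
final class RandomNodeSelectorSketch {

    private final Random random;

    RandomNodeSelectorSketch(Random random) {
        this.random = random;
    }

    // Logs both node lists, then picks a random node.
    // Returns null if no node is known at all.
    String selectNode(String clusterName, String[] connectedNodes, String[] totalAvailableNodes) {
        System.out.printf("connectedNodes(%d) '%s', availableNodes(%d) '%s'%n",
                connectedNodes.length, Arrays.toString(connectedNodes),
                totalAvailableNodes.length, Arrays.toString(totalAvailableNodes));

        // Sketch choice: prefer nodes with an open connection.
        String[] pool = connectedNodes.length > 0 ? connectedNodes : totalAvailableNodes;
        if (pool.length == 0) {
            return null;
        }
        return pool[random.nextInt(pool.length)];
    }

    public static void main(String[] args) {
        RandomNodeSelectorSketch selector = new RandomNodeSelectorSketch(new Random());
        String chosen = selector.selectNode("ejb",
                new String[] {"node1"}, new String[] {"node3", "node1"});
        System.out.println("selected " + chosen);
    }
}
```

      With this preference, a stale node3 entry in availableNodes is only picked when no connected node exists.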

      As long as all nodes are up and running, the client calls the EJBs in a balanced way.

      When node1 is shut down, the client receives the notifications below:

      ...
      DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
      DEBUG (XNIO-1 task-4) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
      DEBUG (XNIO-1 task-4) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
      DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
      DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received CLUSTER_TOPOLOGY_NODE_REMOVAL(18) message for (cluster, node) = (ejb, node1)
      DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received CLUSTER_TOPOLOGY_NODE_REMOVAL(18) message for (cluster, node) = (ejb, node1)
      ...
      

      Then node2 is shut down. Again the client is notified:

      ...
      DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
      DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
      DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received CLUSTER_TOPOLOGY_NODE_REMOVAL(18) message for (cluster, node) = (ejb, node2)
      ...
      

      Finally, node3 is shut down. This time the client receives only the following messages:

      ...
      DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
      DEBUG (XNIO-1 task-1) [org.jboss.ejb.client.invocation] Received MODULE_UNAVAILABLE(9) message for module /playground
      ...
      

      This means the client is not informed that node3, the last node of the cluster, has been stopped: no CLUSTER_TOPOLOGY_NODE_REMOVAL message is received for node3.

      From this point on, every client invocation fails with Caused by: java.net.ConnectException: Connection refused

      Now node1 is started again. Although node3 is still stopped, the client continues to list it in availableNodes:

      ...
      INFO  (ThreadPoolTaskExecutor-1) [com.jboss.examples.ejb.CustomClusterNodeSelector] connectedNodes(1) '[node1]', availableNodes(2) '[node3, node1]'
      ...
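
      The sequence above can be replayed with a toy model of a client-side membership view. This sketch does not reflect the actual internals of the EJB client; TopologyViewSketch and its methods are hypothetical. It only assumes that a node leaves the view when a CLUSTER_TOPOLOGY_NODE_REMOVAL message arrives for it and joins when the client learns of it.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical toy model of the client's view of the "ejb" cluster.
final class TopologyViewSketch {

    private final Set<String> available = new LinkedHashSet<>();

    // The client learns about a node (initial topology or node start).
    void nodeAdded(String node) {
        available.add(node);
    }

    // A CLUSTER_TOPOLOGY_NODE_REMOVAL message arrived for this node.
    void nodeRemoved(String node) {
        available.remove(node);
    }

    Set<String> availableNodes() {
        return new LinkedHashSet<>(available);
    }

    // Replays the scenario from this report and returns the final view.
    static Set<String> replay() {
        TopologyViewSketch view = new TopologyViewSketch();
        view.nodeAdded("node1");
        view.nodeAdded("node2");
        view.nodeAdded("node3");

        view.nodeRemoved("node1"); // removal message received for node1
        view.nodeRemoved("node2"); // removal message received for node2
        // node3 stops, but no removal message arrives, so nothing is removed.

        view.nodeAdded("node1");   // node1 is started again
        return view.availableNodes();
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints [node3, node1]
    }
}
```

      In this model the stale node3 entry survives because nothing ever removes it, which matches the availableNodes output reported above.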
      

      Attachments

        1. node1.txt
          122 kB
        2. node12.txt
          508 kB
        3. node2.txt
          486 kB
        4. node3.txt
          1.72 MB
        5. playground.zip
          16 kB

            People

              rachmato@redhat.com Richard Achmatowicz
              rhn-support-jbaesner Joerg Baesner