Infinispan / ISPN-14782

Unable to reenable rebalance after cluster scale up


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • None
    • 14.0.8.Final
    • Core, Operator
    • None

      The operator graceful shutdown procedure used for an upgrade is as follows:

      1. Disable rebalance
      2. Call /container/?action=shutdown on each cluster member
      3. Scale cluster down to 0 pods
      4. Scale cluster back to original number of pods
      5. Wait for /cache-managers/default/health to report all members
      6. Enable rebalance
      7. Upgrade complete
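
      As a rough illustration, the sequence above can be written down as an ordered plan. This is a minimal sketch and not the operator's code: the class and method names are hypothetical, nothing is sent over the network, and the two rebalancing paths are assumptions (only the shutdown and health paths appear in this report).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical planner: lists the upgrade steps in order; it performs no I/O. */
final class GracefulUpgradePlan {

    // Paths quoted from the report.
    static final String SHUTDOWN = "/container/?action=shutdown";
    static final String HEALTH = "/cache-managers/default/health";

    // Assumed paths for toggling rebalancing; check them against the server's REST docs.
    static final String DISABLE_REBALANCE = "/cache-managers/default?action=disable-rebalancing";
    static final String ENABLE_REBALANCE = "/cache-managers/default?action=enable-rebalancing";

    /** Returns one human-readable line per step, in execution order. */
    static List<String> plan(List<String> pods) {
        List<String> steps = new ArrayList<>();
        steps.add("POST " + DISABLE_REBALANCE);                 // 1. disable rebalance
        for (String pod : pods) {
            steps.add("POST " + pod + SHUTDOWN);                // 2. shut down each member
        }
        steps.add("SCALE 0");                                   // 3. scale down to 0 pods
        steps.add("SCALE " + pods.size());                      // 4. scale back up
        steps.add("WAIT " + HEALTH + " members=" + pods.size()); // 5. wait for all members
        steps.add("POST " + ENABLE_REBALANCE);                  // 6. enable rebalance
        return steps;
    }

    public static void main(String[] args) {
        plan(List.of("pod-0", "pod-1")).forEach(System.out::println);
    }
}
```

      Step 7 ("Upgrade complete") has no action of its own, so the plan ends with the enable-rebalance call.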

      However, step 6 sometimes fails with the following server-side exception, and the upgrade cannot proceed:

      12:45:14,469 ERROR (blocking-thread--p3-t1) [org.infinispan.rest.RestRequestHandler] ISPN012005: An error occurred while responding to the client java.util.concurrent.CompletionException: org.infinispan.remoting.RemoteException: ISPN000217: Received exception from test-upgrade-1-11326, see cause for remote stack trace
      	at org.infinispan.util.concurrent.CompletionStages.join(CompletionStages.java:87)
      	at org.infinispan.topology.LocalTopologyManagerImpl.setCacheRebalancingEnabled(LocalTopologyManagerImpl.java:707)
      	at org.infinispan.topology.LocalTopologyManagerImpl.setRebalancingEnabled(LocalTopologyManagerImpl.java:659)
      	at org.infinispan.rest.resources.ContainerResource.lambda$setRebalancing$2(ContainerResource.java:188)
      	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
      	at org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
      	at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1982)
      	at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
      	at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1377)
      	at java.base/java.lang.Thread.run(Thread.java:833)
      Caused by: org.infinispan.remoting.RemoteException: ISPN000217: Received exception from test-upgrade-1-11326, see cause for remote stack trace
      	at org.infinispan.remoting.transport.ResponseCollectors.wrapRemoteException(ResponseCollectors.java:25)
      	at org.infinispan.remoting.transport.impl.VoidResponseCollector.addException(VoidResponseCollector.java:47)
      	at org.infinispan.remoting.transport.impl.VoidResponseCollector.addException(VoidResponseCollector.java:19)
      	at org.infinispan.remoting.transport.ValidResponseCollector.addResponse(ValidResponseCollector.java:29)
      	at org.infinispan.topology.TopologyManagementHelper.lambda$addLocalResult$2(TopologyManagementHelper.java:129)
      	at java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:934)
      	at java.base/java.util.concurrent.CompletableFuture.uniHandleStage(CompletableFuture.java:950)
      	at java.base/java.util.concurrent.CompletableFuture.handle(CompletableFuture.java:2340)
      	at java.base/java.util.concurrent.CompletableFuture.handle(CompletableFuture.java:144)
      	at org.infinispan.topology.TopologyManagementHelper.lambda$addLocalResult$3(TopologyManagementHelper.java:125)
      	at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1150)
      	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
      	at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
      	at org.infinispan.remoting.transport.AbstractRequest.complete(AbstractRequest.java:68)
      	at org.infinispan.remoting.transport.impl.MultiTargetRequest.onResponse(MultiTargetRequest.java:106)
      	at org.infinispan.remoting.transport.impl.RequestRepository.addResponse(RequestRepository.java:51)
      	at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processResponse(JGroupsTransport.java:1578)
      	at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processMessage(JGroupsTransport.java:1480)
      	at org.infinispan.remoting.transport.jgroups.JGroupsTransport$ChannelCallbacks.up(JGroupsTransport.java:1685)
      	at org.jgroups.JChannel.up(JChannel.java:733)
      	at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:921)
      	at org.jgroups.protocols.FRAG2.up(FRAG2.java:138)
      	at org.jgroups.protocols.FlowControl.up(FlowControl.java:245)
      	at org.jgroups.protocols.FlowControl.up(FlowControl.java:245)
      	at org.jgroups.protocols.pbcast.GMS.up(GMS.java:845)
      	at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:226)
      	at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1052)
      	at org.jgroups.protocols.UNICAST3.addMessage(UNICAST3.java:794)
      	at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:776)
      	at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:425)
      	at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:658)
      	at org.jgroups.protocols.VERIFY_SUSPECT2.up(VERIFY_SUSPECT2.java:105)
      	at org.jgroups.protocols.FailureDetection.up(FailureDetection.java:180)
      	at org.jgroups.protocols.FD_SOCK2.up(FD_SOCK2.java:188)
      	at org.jgroups.protocols.MERGE3.up(MERGE3.java:274)
      	at org.jgroups.protocols.Discovery.up(Discovery.java:294)
      	at org.jgroups.stack.Protocol.up(Protocol.java:314)
      	at org.jgroups.protocols.TP.passMessageUp(TP.java:1178)
      	at org.jgroups.util.SubmitToThreadPool$SingleMessageHandler.run(SubmitToThreadPool.java:100)
      	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
      	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
      	... 1 more
      Caused by: org.infinispan.remoting.RemoteException: ISPN000217: Received exception from test-upgrade-1-11326, see cause for remote stack trace
      	at org.infinispan.topology.TopologyManagementHelper.makeResponse(TopologyManagementHelper.java:138)
      	at org.infinispan.topology.TopologyManagementHelper.lambda$addLocalResult$2(TopologyManagementHelper.java:126)
      	... 37 more
      Caused by: java.lang.NullPointerException: Cannot invoke "org.infinispan.topology.CacheTopology.getTopologyId()" because "cacheTopology" is null
      	at org.infinispan.commands.topology.TopologyUpdateStableCommand.<init>(TopologyUpdateStableCommand.java:44)
      	at org.infinispan.topology.ClusterTopologyManagerImpl.broadcastStableTopologyUpdate(ClusterTopologyManagerImpl.java:643)
      	at org.infinispan.topology.ClusterCacheStatus.startQueuedRebalance(ClusterCacheStatus.java:835)
      	at java.base/java.util.concurrent.ConcurrentHashMap$ValuesView.forEach(ConcurrentHashMap.java:4780)
      	at org.infinispan.topology.ClusterTopologyManagerImpl.setRebalancingEnabled(ClusterTopologyManagerImpl.java:689)
      	at org.infinispan.topology.ClusterTopologyManagerImpl.setRebalancingEnabled(ClusterTopologyManagerImpl.java:665)
      	at org.infinispan.commands.topology.RebalancePolicyUpdateCommand.invokeAsync(RebalancePolicyUpdateCommand.java:37)
      	at org.infinispan.topology.TopologyManagementHelper.invokeAsync(TopologyManagementHelper.java:151)
      	at org.infinispan.topology.TopologyManagementHelper.executeOnClusterSync(TopologyManagementHelper.java:51)
      	at org.infinispan.topology.LocalTopologyManagerImpl.setCacheRebalancingEnabled(LocalTopologyManagerImpl.java:707)
      	at org.infinispan.topology.LocalTopologyManagerImpl.setRebalancingEnabled(LocalTopologyManagerImpl.java:659)
      	at org.infinispan.rest.resources.ContainerResource.lambda$setRebalancing$2(ContainerResource.java:188)
      	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
      	at org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
      	at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1982)
      	at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
      	at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1377)
      	... 1 more
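
      Reading the innermost cause, enabling rebalancing iterates over every cache status and broadcasts a stable topology update, and the NullPointerException is raised when a status has no topology to broadcast. The sketch below reproduces that shape with stand-in types and shows one possible guard. It is an assumption-laden illustration, not the Infinispan code or the fix that resolved this issue: Topology, CacheStatus and both methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Hypothetical stand-ins illustrating the failure shape seen in the trace. */
final class StableTopologyGuardSketch {

    /** Stand-in for a cache topology; only the id matters here. */
    static final class Topology {
        final int id;
        Topology(int id) { this.id = id; }
    }

    /** Stand-in for per-cache coordinator state; stableTopology may be null. */
    static final class CacheStatus {
        final String cacheName;
        final Topology stableTopology; // assumed null until a stable topology exists
        CacheStatus(String cacheName, Topology stableTopology) {
            this.cacheName = cacheName;
            this.stableTopology = stableTopology;
        }
    }

    /** Dereferences the topology unconditionally, so it throws an NPE on null. */
    static String stableUpdateMessage(String cacheName, Topology stable) {
        return cacheName + "@" + stable.id;
    }

    /** Guarded variant: caches without a stable topology are skipped. */
    static List<String> broadcastStableUpdates(Collection<CacheStatus> statuses) {
        List<String> broadcasts = new ArrayList<>();
        for (CacheStatus status : statuses) {
            if (status.stableTopology == null) {
                continue; // nothing to broadcast for this cache yet
            }
            broadcasts.add(stableUpdateMessage(status.cacheName, status.stableTopology));
        }
        return broadcasts;
    }
}
```

      Whether skipping such a cache is the right behaviour, rather than deferring the rebalance until a topology exists, is a design question this sketch does not settle.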
      

        1. terminal (184 kB)
        2. trace_logs (510 kB)

              remerson@redhat.com Ryan Emerson