  Infinispan / ISPN-13012

XSiteAutoStateTransferTest.testNewSiteMasterStartsStateTransfer random failures

    Details

      Description

      The test ignores some (non-xsite) state transfer commands, but not StateTransferCancelCommand, so it fails when NodeB sends one:

      17:03:27,940 DEBUG (non-blocking-thread-Test-NodeB-p23613-t4:[]) [ControlledRpcManager] Intercepted command to [Test-NodeC]: StateTransferCancelCommand{topologyId=11, segments={4 6 8-12}, cacheName=defaultcache}
      17:03:28,033 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.xsite.statetransfer.XSiteAutoStateTransferTest.testNewSiteMasterStartsStateTransfer
      java.util.concurrent.ExecutionException: java.lang.AssertionError: Expecting a org.infinispan.xsite.commands.XSiteAutoTransferStatusCommand, got StateTransferCancelCommand{topologyId=11, segments={4 6 8-12}, cacheName=defaultcache}
      	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) ~[?:?]
      	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2022) ~[?:?]
      	at org.infinispan.xsite.statetransfer.XSiteAutoStateTransferTest.testNewSiteMasterStartsStateTransfer(XSiteAutoStateTransferTest.java:308) ~[test-classes/:?]
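
      The failure mode above can be illustrated with a minimal, self-contained sketch. It is not the real ControlledRpcManager API: the class IgnoredCommandsSketch, its ignore/intercept/expect methods, and the nested command classes are hypothetical stand-ins that only borrow the names from the log. The sketch assumes the test keeps a set of ignored command types and asserts on the type of the next intercepted command.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Hypothetical model of a test interceptor that skips "ignored" command types
// and asserts on the type of the next command it kept.
public class IgnoredCommandsSketch {
    // Stand-ins for the real command classes (assumption: only the names match).
    public interface Command {}
    public static class StateTransferStartCommand implements Command {}
    public static class StateTransferCancelCommand implements Command {}
    public static class XSiteAutoTransferStatusCommand implements Command {}

    private final Set<Class<? extends Command>> ignored = new HashSet<>();
    private final Queue<Command> intercepted = new ArrayDeque<>();

    // Register command types that should pass through unnoticed.
    @SafeVarargs
    public final void ignore(Class<? extends Command>... types) {
        ignored.addAll(Arrays.asList(types));
    }

    // Record a command unless its type is ignored.
    public void intercept(Command command) {
        if (!ignored.contains(command.getClass())) {
            intercepted.add(command);
        }
    }

    // Take the next recorded command and fail if it has the wrong type.
    public <T extends Command> T expect(Class<T> type) {
        Command next = intercepted.poll();
        if (!type.isInstance(next)) {
            throw new AssertionError("Expecting a " + type.getName() + ", got " + next);
        }
        return type.cast(next);
    }

    public static void main(String[] args) {
        // Without the cancel command in the ignore list, the expectation fails.
        IgnoredCommandsSketch strict = new IgnoredCommandsSketch();
        strict.ignore(StateTransferStartCommand.class);
        strict.intercept(new StateTransferCancelCommand());
        strict.intercept(new XSiteAutoTransferStatusCommand());
        try {
            strict.expect(XSiteAutoTransferStatusCommand.class);
        } catch (AssertionError e) {
            System.out.println("strict: " + e.getMessage());
        }

        // With the cancel command ignored, the same sequence passes.
        IgnoredCommandsSketch lenient = new IgnoredCommandsSketch();
        lenient.ignore(StateTransferStartCommand.class, StateTransferCancelCommand.class);
        lenient.intercept(new StateTransferCancelCommand());
        lenient.intercept(new XSiteAutoTransferStatusCommand());
        lenient.expect(XSiteAutoTransferStatusCommand.class);
        System.out.println("lenient: ok");
    }
}
```

      In this model, adding StateTransferCancelCommand to the ignored types is enough to make the expectation independent of whether a rebalance was cancelled; whether the actual fix for the test takes this form is not stated in the report.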
      

      Normally NodeB does not send a StateTransferCancelCommand. Sometimes, however, NodeA manages to start a rebalance with members [NodeB, NodeC] while the test is killing it. NodeB then becomes coordinator and cancels that rebalance before starting a new one with the same members.

      17:03:27,920 TRACE (testng-Test:[]) [JGroupsTransport] Test-NodeA sending command to all: RebalanceStartCommand{cacheName='defaultcache', origin=Test-NodeA, currentCH=DefaultConsistentHash{ns=21, owners = (2)[Test-NodeB: 9+5, Test-NodeC: 12+2]}, pendingCH=DefaultConsistentHash{ns=21, owners = (2)[Test-NodeB: 10+11, Test-NodeC: 11+10]}, phase=READ_OLD_WRITE_ALL, actualMembers=[Test-NodeB, Test-NodeC], persistentUUIDs=[5994a338-b58d-440a-939f-75db7bb5c387, dc5282ec-4b27-4bac-b48e-e0138cf7258f], rebalanceId=4, topologyId=11, viewId=2}
      17:03:27,929 DEBUG (non-blocking-thread-Test-NodeB-p23613-t6:[Merge-3]) [PreferAvailabilityStrategy] Recovered a single partition for cache defaultcache: CacheTopology{id=11, phase=READ_OLD_WRITE_ALL, rebalanceId=4, currentCH=DefaultConsistentHash{ns=21, owners = (2)[Test-NodeB: 9+5, Test-NodeC: 12+2]}, pendingCH=DefaultConsistentHash{ns=21, owners = (2)[Test-NodeB: 10+11, Test-NodeC: 11+10]}, unionCH=null, actualMembers=[Test-NodeB, Test-NodeC], persistentUUIDs=[5994a338-b58d-440a-939f-75db7bb5c387, dc5282ec-4b27-4bac-b48e-e0138cf7258f]}
      17:03:27,929 DEBUG (non-blocking-thread-Test-NodeB-p23613-t6:[Merge-3]) [CLUSTER] ISPN000521: Cache defaultcache recovered after merge with topology = CacheTopology{id=12, phase=NO_REBALANCE, rebalanceId=5, currentCH=DefaultConsistentHash{ns=21, owners = (2)[Test-NodeB: 9+5, Test-NodeC: 12+2]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeB, Test-NodeC], persistentUUIDs=[5994a338-b58d-440a-939f-75db7bb5c387, dc5282ec-4b27-4bac-b48e-e0138cf7258f]}, availability mode null
      
      17:03:27,932 TRACE (non-blocking-thread-Test-NodeC-p23642-t4:[]) [StateConsumerImpl] Received new topology for cache defaultcache, isRebalance = false, isMember = true, topology = CacheTopology{id=12, phase=NO_REBALANCE, rebalanceId=5, currentCH=DefaultConsistentHash{ns=21, owners = (2)[Test-NodeB: 9+5, Test-NodeC: 12+2]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeB, Test-NodeC], persistentUUIDs=[5994a338-b58d-440a-939f-75db7bb5c387, dc5282ec-4b27-4bac-b48e-e0138cf7258f]}
      17:03:27,932 TRACE (non-blocking-thread-Test-NodeC-p23642-t4:[]) [StateConsumerImpl] On cache defaultcache we have: added segments: {}; removed segments: {7 14-19}
      17:03:27,932 TRACE (non-blocking-thread-Test-NodeC-p23642-t4:[]) [InboundTransferTask] Partially cancelling inbound state transfer from node Test-NodeB, segments {7 14-19}
      17:03:27,933 TRACE (non-blocking-thread-Test-NodeC-p23642-t4:[]) [JGroupsTransport] Test-NodeC sending command to Test-NodeB: StateTransferCancelCommand{topologyId=11, segments={7 14-19}, cacheName=defaultcache}
      

            People

            Assignee:
            dan.berindei Dan Berindei
            Reporter:
            dan.berindei Dan Berindei