Uploaded image for project: 'Infinispan'
  1. Infinispan
  2. ISPN-4490

Members can miss the rebalance cancellation on coordinator change

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Done
    • Icon: Major Major
    • 7.0.0.Alpha5
    • 7.0.0.Alpha4
    • Core, State Transfer
    • None

      The new coordinator sends first a CH_UPDATE command to cancel the existing rebalance, and then a REBALANCE_START command to start a new rebalance. But the CH_UPDATE command is sent asynchronously, so it's possible for some members to receive it after the REBALANCE_START command.

      If that happens, that node will assume that it will receive the segments it requested for the previous rebalance. But with the ISPN-4484 fix, the provider node cancels the outbound transfer tasks when receiving a CH_UPDATE without a pendingCH, so the state requestor will never receive its segments.

      Even without the ISPN-4484 fix this is a problem, although less obvious. Between the provider node receiving the CH_UPDATE and the REBALANCE_START commands, it won't have the requestor in its write CH, so the requestor can miss transactions.

            dberinde@redhat.com Dan Berindei (Inactive)
            dberinde@redhat.com Dan Berindei (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

              Created:
              Updated:
              Resolved: