ISPN-4490

Members can miss the rebalance cancellation on coordinator change

Details

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version/s: 7.0.0.Alpha5
    • Affects Version/s: 7.0.0.Alpha4
    • Component/s: Core, State Transfer
    • Labels: None

    Description

      The new coordinator first sends a CH_UPDATE command to cancel the existing rebalance, then a REBALANCE_START command to start a new one. Because the CH_UPDATE command is sent asynchronously, some members can receive it after the REBALANCE_START command.

      If that happens, the affected member assumes it will still receive the segments it requested for the previous rebalance. But with the ISPN-4484 fix, the provider node cancels its outbound transfer tasks when it receives a CH_UPDATE without a pendingCH, so the state requestor never receives those segments.
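
      The reordering described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Infinispan's actual state transfer code: the Provider and Requestor classes, their method names, and the segment numbers 0 and 1 are all hypothetical. It assumes the provider always sees the CH_UPDATE first, and that a requestor skips requesting segments it believes are already in flight. What the requestor does with the late CH_UPDATE is left out of the model.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class RebalanceReorderSketch {
    // Hypothetical provider: tracks segments it is currently pushing to the requestor.
    static class Provider {
        final Set<Integer> outbound = new TreeSet<>();

        // Behaviour described for the ISPN-4484 fix: cancel outbound tasks.
        void onChUpdateWithoutPendingCh() { outbound.clear(); }

        void onSegmentRequest(Set<Integer> segments) { outbound.addAll(segments); }
    }

    // Hypothetical requestor: tracks segments it believes are already on their way.
    static class Requestor {
        final Set<Integer> inFlight = new TreeSet<>();

        void onChUpdateWithoutPendingCh() { inFlight.clear(); }

        // Assumption: only segments not already in flight are requested again.
        Set<Integer> onRebalanceStart(Set<Integer> needed) {
            Set<Integer> toRequest = new TreeSet<>(needed);
            toRequest.removeAll(inFlight);
            inFlight.addAll(toRequest);
            return toRequest;
        }
    }

    // Returns the segments the requestor is waiting for that no provider task will send.
    static Set<Integer> strandedSegments(boolean cancelArrivesFirst) {
        Set<Integer> segments = new TreeSet<>(Arrays.asList(0, 1));
        Provider provider = new Provider();
        Requestor requestor = new Requestor();

        // The previous rebalance is still in progress on both sides.
        provider.outbound.addAll(segments);
        requestor.inFlight.addAll(segments);

        // The provider receives the cancellation before anything else.
        provider.onChUpdateWithoutPendingCh();

        // The requestor sees the cancellation only if it was not overtaken.
        if (cancelArrivesFirst) {
            requestor.onChUpdateWithoutPendingCh();
        }
        provider.onSegmentRequest(requestor.onRebalanceStart(segments));

        Set<Integer> stranded = new TreeSet<>(requestor.inFlight);
        stranded.removeAll(provider.outbound);
        return stranded;
    }

    public static void main(String[] args) {
        System.out.println("in order:  " + strandedSegments(true));
        System.out.println("reordered: " + strandedSegments(false));
    }
}
```

      In this model, in-order delivery leaves no stranded segments, while the reordered delivery leaves both segments waiting on transfers that the provider has already cancelled.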

      Even without the ISPN-4484 fix this is a problem, although a less obvious one. Between receiving the CH_UPDATE and the REBALANCE_START commands, the provider node does not have the requestor in its write CH, so the requestor can miss transactions.
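
      The write CH gap can be sketched in the same spirit. Again this is a simplified illustration and not Infinispan code: the node names "A" and "B", the transaction names, and the Provider class are hypothetical. It assumes a provider forwards a committed write to the requestor only while the requestor is in the provider's write CH.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class WriteChGapSketch {
    // Hypothetical provider view of the write consistent hash.
    static class Provider {
        final Set<String> writeOwners = new LinkedHashSet<>();
        final List<String> sentToRequestor = new ArrayList<>();

        // CH_UPDATE without a pending CH: only the current owners remain write owners.
        void onChUpdate(Set<String> currentOwners) {
            writeOwners.clear();
            writeOwners.addAll(currentOwners);
        }

        // REBALANCE_START: the pending owners become write owners as well.
        void onRebalanceStart(Set<String> pendingOwners) {
            writeOwners.addAll(pendingOwners);
        }

        // A write reaches the requestor only if it is currently a write owner.
        void commit(String tx, String requestor) {
            if (writeOwners.contains(requestor)) {
                sentToRequestor.add(tx);
            }
        }
    }

    // Returns the transactions that the provider forwarded to requestor "B".
    static List<String> deliveredToRequestor() {
        String requestor = "B";
        Provider provider = new Provider();

        // Previous rebalance in progress: both nodes are write owners.
        provider.writeOwners.addAll(Arrays.asList("A", "B"));
        provider.commit("tx1", requestor);

        // The rebalance is cancelled on the provider; "B" drops out of the write CH.
        provider.onChUpdate(Collections.singleton("A"));
        provider.commit("tx2", requestor); // falls into the gap

        // The new rebalance starts; "B" is a write owner again.
        provider.onRebalanceStart(new LinkedHashSet<>(Arrays.asList("A", "B")));
        provider.commit("tx3", requestor);

        return provider.sentToRequestor;
    }

    public static void main(String[] args) {
        System.out.println(deliveredToRequestor());
    }
}
```

      In this model tx2 never reaches "B". A requestor that processed the REBALANCE_START first and kept its earlier transfers would have no reason to fetch that write again.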

            People

              dberinde@redhat.com Dan Berindei (Inactive)