Infinispan / ISPN-4310

StateResponse chunk with lastChunk=true from cancelled ST stops receiving data in next ST


      1. A requests a segment from B (the segment is sent in multiple chunks).
      2. B sends all chunks, but before A receives them, a new topology arrives and A cancels the state transfer (ST).
      3. Another topology arrives and A requests the same segment again.
      4. A receives the old StateResponseCommand with lastChunk=true and concludes that the segment is complete; it therefore discards the chunks sent in response to the new request.

      The result is an inconsistent cluster and, after further rebalances, complete loss of data.
      This ought to be rare, but it was observed repeatedly when gracefully stopping the coordinator on a 32-node cluster full of data.
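
      The four steps above can be modelled in a few lines of Java. This is only a sketch of the race, not Infinispan code: the classes Chunk and SegmentReceiver, the topologyId field on the chunk, and the checkTopology flag are all hypothetical names introduced for illustration. With the flag off, the stale lastChunk=true response closes the new request early. With it on, the receiver drops any chunk whose topology id does not match the current request. That is one possible guard, assumed here, and not necessarily how the issue was resolved.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the race; none of these types are Infinispan APIs. */
final class StaleChunkSketch {

    /** Simplified state response: one chunk of one segment. */
    static final class Chunk {
        final int topologyId;    // assumed: topology in which the request was issued
        final String entry;      // stand-in for the chunk's cache entries
        final boolean lastChunk;

        Chunk(int topologyId, String entry, boolean lastChunk) {
            this.topologyId = topologyId;
            this.entry = entry;
            this.lastChunk = lastChunk;
        }
    }

    /** Node A's bookkeeping for a single segment. */
    static final class SegmentReceiver {
        private final boolean checkTopology;
        private final List<String> received = new ArrayList<>();
        private int requestTopologyId = -1;
        private boolean active = false;

        SegmentReceiver(boolean checkTopology) {
            this.checkTopology = checkTopology;
        }

        void request(int topologyId) {
            requestTopologyId = topologyId;
            received.clear();
            active = true;
        }

        void cancel() {
            received.clear();
            active = false;
        }

        void onChunk(Chunk chunk) {
            if (!active) {
                return; // no transfer in progress: chunk is discarded
            }
            if (checkTopology && chunk.topologyId != requestTopologyId) {
                return; // stale response from a cancelled transfer
            }
            received.add(chunk.entry);
            if (chunk.lastChunk) {
                active = false; // segment considered complete
            }
        }

        List<String> received() {
            return new ArrayList<>(received);
        }
    }

    /** Replays steps 1-4 and returns the entries A ends up with. */
    static List<String> run(boolean checkTopology) {
        SegmentReceiver a = new SegmentReceiver(checkTopology);
        a.request(1);                         // step 1: request in topology 1
        a.cancel();                           // step 2: new topology, ST cancelled
        a.request(3);                         // step 3: request again in topology 3
        a.onChunk(new Chunk(1, "k2", true));  // step 4: old last chunk arrives late
        a.onChunk(new Chunk(3, "k1", false)); // chunks answering the new request
        a.onChunk(new Chunk(3, "k2", true));
        return a.received();
    }

    public static void main(String[] args) {
        System.out.println("without check: " + run(false)); // [k2], k1 is lost
        System.out.println("with check:    " + run(true));  // [k1, k2]
    }
}
```

      Without the check, A keeps only the entry from the stale chunk and silently drops the rest of the segment, which matches the inconsistency described above.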

              dberinde@redhat.com Dan Berindei (Inactive)
              rvansa1@redhat.com Radim Vansa (Inactive)
              Archiver:
              rhn-support-adongare Amol Dongare
