  1. Infinispan
  2. ISPN-3366

Data loss when an entry is forwarded to the primary owner and the primary owner shuts down

    Details

      Description

      This looks like a problem in entry forwarding.

      Here is the test scenario:

      • DIST mode with numOwners=2; start with a 4-node cluster, then shut down 1 node normally during load
      • HotRod putIfAbsent calls from 40 threads (1 process, 1 remote cache instance), 40000 entries total
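The client side of this scenario can be sketched roughly as follows. This is not the original test harness: the class name `PutIfAbsentLoad` is hypothetical, a plain `ConcurrentMap` stands in for the HotRod remote cache, and the even split of 1000 keys per thread with keys shaped like `thread16key59` is an assumption inferred from the numbers in this report.

```java
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical load generator; a ConcurrentMap stands in for the remote cache.
final class PutIfAbsentLoad {
    /** Runs putIfAbsent from the given number of threads and returns the resulting entry count. */
    static int run(ConcurrentMap<String, String> cache, int threads, int keysPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int thread = t;
            pool.execute(() -> {
                for (int k = 0; k < keysPerThread; k++) {
                    // Assumed key format: thread<T>key<K>, so every key is unique
                    cache.putIfAbsent("thread" + thread + "key" + k, "value");
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("load did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return cache.size();
    }
}
```

Against a healthy in-memory map, 40 threads with 1000 keys each produce 40000 entries. In the real test the node shutdown happens while this loop is still running.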

      After the test run, the numberOfEntries value on each node is:

      • node1: 26608
      • node2: 26622
      • node3: 26746
      • node4: 0

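      With numOwners=2, 40000 entries should yield 80000 stored copies. The total is 79976 and the HotRod client received 11 errors, so 79976 + (11 * 2) = 79998. That leaves 2 copies unaccounted for, which means 1 entry is completely missing.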
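The accounting above can be checked mechanically. The following is a minimal sketch with hypothetical names (`EntryAccounting`, `missingEntries`). Like the calculation above, it assumes every successful write is stored on exactly numOwners nodes and every failed operation stored no copy at all.

```java
// Hypothetical helper, not part of Infinispan or of the original test harness.
final class EntryAccounting {
    /**
     * Returns the number of entries that are neither stored nor reported as errors.
     * Assumes each successful write produces exactly numOwners copies and each
     * client error produced none.
     */
    static long missingEntries(long totalEntries, int numOwners, long clientErrors, long[] entriesPerNode) {
        long storedCopies = 0;
        for (long n : entriesPerNode) {
            storedCopies += n;
        }
        long expectedCopies = (totalEntries - clientErrors) * numOwners;
        return (expectedCopies - storedCopies) / numOwners;
    }

    public static void main(String[] args) {
        long[] perNode = {26608, 26622, 26746, 0}; // node1..node4 from this report
        System.out.println(missingEntries(40000, 2, 11, perNode)); // prints 1
    }
}
```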

      Let's take a look at the missing entry, hash(thread16key59) = 574ff563.

      Current consistent hash (CH): owners(574ff563) = [node4, node1]

      The sequence of events is:

      • The HotRod client sends the request to node1
      • node1 forwards it to the primary owner, node4
      • node4 shuts down without processing the forwarded entry

      As a result, owners(7c29bccb) is [] (empty). The entry is completely lost, without any errors.
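The sequence above can be illustrated with a toy model. It does not use or mirror Infinispan internals: `ForwardingLossModel`, `Node`, and `writeAsPrimary` are hypothetical names, and the silent drop on a node that is shutting down is an assumption that reproduces the observed outcome, not a confirmed root cause.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the reported sequence; not Infinispan code.
final class ForwardingLossModel {
    static final class Node {
        final String name;
        final Map<String, String> store = new HashMap<>();
        boolean shuttingDown = false;

        Node(String name) { this.name = name; }

        /** Primary-owner write: apply locally, then copy to the backup owners. */
        void writeAsPrimary(String key, String value, List<Node> backups) {
            if (shuttingDown) {
                return; // assumed behaviour: the forwarded write is dropped and no error is raised
            }
            store.put(key, value);
            for (Node backup : backups) {
                backup.store.put(key, value);
            }
        }
    }

    /** Returns the names of the nodes that hold the key after the forwarded write. */
    static List<String> run(boolean primaryShutsDownFirst) {
        Node node1 = new Node("node1");
        Node node4 = new Node("node4");
        node4.shuttingDown = primaryShutsDownFirst;

        // The client request arrived at node1, which forwards it to the primary owner node4.
        node4.writeAsPrimary("thread16key59", "value", Arrays.asList(node1));

        List<String> holders = new ArrayList<>();
        for (Node n : Arrays.asList(node4, node1)) {
            if (n.store.containsKey("thread16key59")) {
                holders.add(n.name);
            }
        }
        return holders;
    }
}
```

In this model `run(false)` leaves the entry on both owners, while `run(true)` leaves it on neither, and nothing signals the failure to the forwarding node or the client.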


          Attachments

          1. ISPN-3366-full-logs-3rd.zip
            6.94 MB
          2. ISPN-3366-full-logs-4th.zip
            5.95 MB
          3. ISPN-3366-logs.zip
            725 kB


                People

                • Assignee: Dan Berindei (dan.berindei)
                • Reporter: Takayoshi Kimura (tkimura)