Infinispan / ISPN-939

Index corruption when remote node dies during commit


    • Type: Bug
    • Resolution: Obsolete
    • Priority: Major
    • Version: 4.2.1.CR2
    • Component: Lucene Directory

      Using a scenario similar to the one described in ISPN-909:

      Infinispan with 3 caches: lockCache (replicated, volatile, no eviction), metadataCache (replicated, persisted, no eviction), dataCache (distributed, persisted, eviction, hash numOwners=2)
      Node 1: coordinator, IndexWriter open constantly and writing a stream of documents, committing after each one
      Node 2: opens a read-only IndexReader to perform queries, using reopen to keep in sync with the updates coming from node 1

      If we "kill -9" node 2 (to simulate a crash), node 1 gets a SuspectException during the pre-commit phase (within IndexWriter.commit()). We catch the Throwable and then close() the writer, but from then on every attempt to access the index, with either readers or writers, fails with "Read past EOF" errors.
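
The handling on node 1 can be sketched roughly as follows. This is only an illustration of the control flow described above (add a document, commit, and on any Throwable close the writer), not the actual test code. `DocumentWriter` and `FailingWriter` are hypothetical stand-ins for Lucene's IndexWriter so that the sketch runs without Lucene or Infinispan on the classpath, and the simulated exception merely stands in for the SuspectException.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of node 1's indexing loop. DocumentWriter is a hypothetical stand-in
// for Lucene's IndexWriter; it is NOT part of the Lucene or Infinispan API.
class CommitLoopSketch {

    /** Hypothetical minimal view of the writer operations named in the report. */
    interface DocumentWriter {
        void addDocument(String doc) throws Exception;
        void commit() throws Exception;
        void close() throws Exception;
    }

    /** Test double: commits succeed until `failAfter` documents are committed, then fail. */
    static class FailingWriter implements DocumentWriter {
        final List<String> committed = new ArrayList<>();
        private final List<String> pending = new ArrayList<>();
        private final int failAfter;
        boolean closed = false;

        FailingWriter(int failAfter) { this.failAfter = failAfter; }

        public void addDocument(String doc) { pending.add(doc); }

        public void commit() throws Exception {
            if (committed.size() >= failAfter) {
                // Stands in for the SuspectException raised when a peer disappears.
                throw new Exception("simulated: remote node suspected during commit");
            }
            committed.addAll(pending);
            pending.clear();
        }

        public void close() { closed = true; }
    }

    /**
     * Adds and commits one document at a time. On any failure the writer is
     * closed and the loop stops. Returns the number of successful commits.
     */
    static int indexAll(DocumentWriter writer, List<String> docs) {
        int committedCount = 0;
        for (String doc : docs) {
            try {
                writer.addDocument(doc);
                writer.commit();
                committedCount++;
            } catch (Throwable t) {
                // Same reaction as in the report: catch the Throwable, close the writer.
                try { writer.close(); } catch (Throwable ignored) { }
                break;
            }
        }
        return committedCount;
    }

    public static void main(String[] args) {
        FailingWriter writer = new FailingWriter(2);
        int n = indexAll(writer, List.of("doc-1", "doc-2", "doc-3", "doc-4"));
        System.out.println("committed=" + n + ", closed=" + writer.closed);
    }
}
```

The sketch does not reproduce the corruption itself, which depends on the state left in the Infinispan caches; it only pins down the order of calls made around the failing commit.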

      Attachments:
        1. read_past_eof.log (1 kB, Tristan Tarrant)
        2. suspect_exception_node1.log (6 kB, Tristan Tarrant)

      Assignee: Sanne Grinovero (Inactive)
      Reporter: Tristan Tarrant