Uploaded image for project: 'Infinispan'
  1. Infinispan
  2. ISPN-6425

FileNotFoundException with async indexing backend

    XMLWordPrintable

Details

    Description

      The Infinispan directory defaults to write_metadata_async=true when the indexing backend is configured as async, i.e. default.worker.execution is true.

      The write_metadata_async=true will use cache.putAsync to write the index file metadata, while still deleting and creating files syncronously. This can lead to
      a stale metadata causing FileNotFoundExceptions when executing queries:

      Suppose a lucene directory contains files [segments_4, _4.si]. During normal regime, apart from the user thread, there could be other 2 threads that could be changing the index, the periodic commit thread (since backend is async) and the async deletion of files.

      The following race can happen:

      Time Thread work type work
      T1 Hibernate Search: Commit Scheduler for index SYNC write files segments_5 and _5.si to the index
      T2 Hibernate Search: Commit Scheduler for index ASYNC write the new file list containing [segments_4, _4.si, segments_5,_5.si]
      T3 Hibernate Search: Commit Scheduler for index ASYNC enqueue a deletion task for files segments_4 and _4.si
      T4 Hibernate Search: async deletion of index SYNC dequeue deletion task for files segments_4 and _4.si
      T5 Hibernate Search: async deletion of index SYNC delete files segments_4 and _4.si from the index
      T6 Hibernate Search: async deletion of index ASYNC write the new file list containing [segments_5,_5.si]
      T7 User-thread   open index reader, file list is [segments_4, _4.si], highest segment number is 4 (file list is not updated yet)
      T8 User-thread   open segments_4
      T9 User-thread   FileNotFoundException!
      T10 remote-thread-User   new file list received [segments_4, _4.si, segments_5,_5.si]
      T11 remote-thread-User   new file list received [segments_5,_5.si]

      This race can be observed in MassIndexerAsyncBackendTest#testMassIndexOnAsync that fails intermittently with the exception:

      Caused by: java.io.FileNotFoundException: Error loading metadata for index file: M|segments_4|commonIndex|-1
      	at org.infinispan.lucene.impl.DirectoryImplementor.openInput(DirectoryImplementor.java:138) ~[infinispan-lucene-directory-9.0.0-SNAPSHOT.jar:9.0.0-SNAPSHOT]
      	at org.infinispan.lucene.impl.DirectoryLucene.openInput(DirectoryLucene.java:102) ~[infinispan-lucene-directory-9.0.0-SNAPSHOT.jar:9.0.0-SNAPSHOT]
      	at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:109) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
      	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:294) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
      	at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:493) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
      	at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:490) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
      	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:731) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
      	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:683) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
      	at org.apache.lucene.index.SegmentInfos.readLatestCommit(SegmentInfos.java:490) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
      	at org.apache.lucene.index.StandardDirectoryReader.isCurrent(StandardDirectoryReader.java:344) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
      	at org.apache.lucene.index.StandardDirectoryReader.doOpenNoWriter(StandardDirectoryReader.java:300) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
      	at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:263) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
      	at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:251) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
      

      We should not enable write_metadata_async=true for async backends. The file list is already DeltaAware, so writing should not pose a meaningfull overhead when done synchronously.

      Attachments

        Issue Links

          Activity

            People

              gfernand@redhat.com Gustavo Fernandes (Inactive)
              gfernand@redhat.com Gustavo Fernandes (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: