-
Bug
-
Resolution: Done
-
Major
-
8.2.0.Final
-
None
The Infinispan directory defaults to write_metadata_async=true when the indexing backend is configured as async, i.e. default.worker.execution is true.
The write_metadata_async=true will use cache.putAsync to write the index file metadata, while still deleting and creating files syncronously. This can lead to
a stale metadata causing FileNotFoundExceptions when executing queries:
Suppose a lucene directory contains files [segments_4, _4.si]. During normal regime, apart from the user thread, there could be other 2 threads that could be changing the index, the periodic commit thread (since backend is async) and the async deletion of files.
The following race can happen:
Time | Thread | work type | work |
---|---|---|---|
T1 | Hibernate Search: Commit Scheduler for index | SYNC | write files segments_5 and _5.si to the index |
T2 | Hibernate Search: Commit Scheduler for index | ASYNC | write the new file list containing [segments_4, _4.si, segments_5,_5.si] |
T3 | Hibernate Search: Commit Scheduler for index | ASYNC | enqueue a deletion task for files segments_4 and _4.si |
T4 | Hibernate Search: async deletion of index | SYNC | dequeue deletion task for files segments_4 and _4.si |
T5 | Hibernate Search: async deletion of index | SYNC | delete files segments_4 and _4.si from the index |
T6 | Hibernate Search: async deletion of index | ASYNC | write the new file list containing [segments_5,_5.si] |
T7 | User-thread | open index reader, file list is [segments_4, _4.si], highest segment number is 4 (file list is not updated yet) | |
T8 | User-thread | open segments_4 | |
T9 | User-thread | FileNotFoundException! | |
T10 | remote-thread-User | new file list received [segments_4, _4.si, segments_5,_5.si] | |
T11 | remote-thread-User | new file list received [segments_5,_5.si] |
This race can be observed in MassIndexerAsyncBackendTest#testMassIndexOnAsync that fails intermittently with the exception:
Caused by: java.io.FileNotFoundException: Error loading metadata for index file: M|segments_4|commonIndex|-1 at org.infinispan.lucene.impl.DirectoryImplementor.openInput(DirectoryImplementor.java:138) ~[infinispan-lucene-directory-9.0.0-SNAPSHOT.jar:9.0.0-SNAPSHOT] at org.infinispan.lucene.impl.DirectoryLucene.openInput(DirectoryLucene.java:102) ~[infinispan-lucene-directory-9.0.0-SNAPSHOT.jar:9.0.0-SNAPSHOT] at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:109) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34] at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:294) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34] at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:493) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34] at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:490) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34] at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:731) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34] at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:683) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34] at org.apache.lucene.index.SegmentInfos.readLatestCommit(SegmentInfos.java:490) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34] at org.apache.lucene.index.StandardDirectoryReader.isCurrent(StandardDirectoryReader.java:344) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34] at org.apache.lucene.index.StandardDirectoryReader.doOpenNoWriter(StandardDirectoryReader.java:300) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34] at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:263) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34] at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:251) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
We should not enable write_metadata_async=true for async backends. The file list is already DeltaAware, so writing should not pose a meaningfull overhead when done synchronously.
- causes
-
ISPN-6337 MassIndexerAsyncBackendTest.testMassIndexOnAsync random failures
- Closed