Infinispan / ISPN-4650

MassIndexer should not use UpdateDocument when adding to Lucene

Details

    Description

      The MassIndexer currently sends a Delete plus an Add operation to the Hibernate Search backend for every entry.
      Lucene buffers those delete queries and, during merges, tries to 'apply' them, wasting a massive amount of time on unnecessary seeks and queries.
      Since the mass indexer wipes the index at the beginning, it should simply issue an add operation (or at least rely on Lucene's atomic IndexWriter.updateDocument). Performance-wise this makes a huge difference:

      • with 50k documents, indexing time drops from 195s to 33s (about 5.9x faster)
      • with 200k documents, indexing time drops from 600s to 55s (about 10.9x faster)
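
To illustrate why add-only is safe after a purge, here is a minimal Java sketch. It is a toy in-memory model, not the Lucene or Hibernate Search API: ToyIndexWriter and all of its methods are hypothetical names, and the linear scan in update() only stands in for the cost of resolving a delete-by-id. It assumes every id is written exactly once per mass-indexing run.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy in-memory model; hypothetical names, NOT the Lucene or Hibernate Search API. */
final class ToyIndexWriter {
    private final List<String> ids = new ArrayList<>();
    private long deleteLookups = 0;

    /** Wipes the index, as the mass indexer does before re-indexing. */
    void purgeAll() {
        ids.clear();
    }

    /** Add only: appends the document without looking for an older copy. */
    void add(String id) {
        ids.add(id);
    }

    /** Delete plus add: searches for an older copy first, then appends. */
    void update(String id) {
        deleteLookups++;   // one delete-by-id to resolve per document
        ids.remove(id);    // linear scan; finds nothing right after a purge
        ids.add(id);
    }

    List<String> ids() {
        return new ArrayList<>(ids);
    }

    int size() {
        return ids.size();
    }

    long deleteLookups() {
        return deleteLookups;
    }

    /** Purges, then writes n documents with unique ids using either strategy. */
    static ToyIndexWriter reindex(int n, boolean useUpdate) {
        ToyIndexWriter writer = new ToyIndexWriter();
        writer.purgeAll();
        for (int i = 0; i < n; i++) {
            String id = "doc-" + i;
            if (useUpdate) {
                writer.update(id);
            } else {
                writer.add(id);
            }
        }
        return writer;
    }

    public static void main(String[] args) {
        ToyIndexWriter viaUpdate = reindex(5_000, true);
        ToyIndexWriter viaAdd = reindex(5_000, false);
        System.out.println("same content: " + viaUpdate.ids().equals(viaAdd.ids()));
        System.out.println("delete lookups (update): " + viaUpdate.deleteLookups());
        System.out.println("delete lookups (add):    " + viaAdd.deleteLookups());
    }
}
```

In this model both strategies leave the same documents in the index after a purge, but only the update path pays for one delete lookup per document. The timings reported above come from the real backend, not from this sketch.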

            People

              gfernand@redhat.com Gustavo Fernandes