Infinispan / ISPN-9044

In Cluster - Infinispan - SingleFileStore - fetchPersistentState/StateTransfer not transferring complete data to Joining Node

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Cannot Reproduce
    • Affects Version/s: 8.2.5.Final
    • Fix Version/s: None
    • Component/s: Lucene Directory
    • Labels:
      None

      Description

      Infinispan - SingleFileStore - fetchPersistentState/StateTransfer not transferring complete data to Joining Node.

      Related to ISPN-8980 (https://issues.jboss.org/browse/ISPN-8980).
      We are using Hibernate Search, with its Lucene indexes stored in Infinispan and persisted through a SingleFileStore.

      With more than one node (for example, four nodes), we observe the behaviour below.
      Steps:

      1. We start the first node, 'N1', in maintenance mode with the MassIndexer, which creates the initial indexes.
      2. After all the MassIndexer/EntityLoader threads have finished (after 1-2 hours), i.e. once mass indexing has completed, we start the other three nodes, 'N2', 'N3' and 'N4', without the MassIndexer.
      3. Under moderate to heavy application usage (concurrency), we again get the same exception: java.io.FileNotFoundException: Error loading metadata for index file. This indicates that some entries are not present in the cache.
      4. This exception occurs only on the other three nodes (N2, N3 and N4), not on the first node, N1.
      5. On checking the sizes of the cache stores on all nodes, the three nodes (N2, N3 and N4) have almost equal sizes (600 MB), which is 50%-70% of the size of N1's cache stores (1.2 GB).
      6. We have repeated these steps multiple times. We have also moved the mass indexing to each of the other three nodes in turn, and we have reduced the number of nodes to two.
      7. The behaviour is exactly the same every time: the exception occurs on all nodes except the initial node that ran the mass indexing.
      8. It seems that N1's cache store persistent state is not being fetched by 'N2', 'N3' and 'N4' when these nodes join.
      9. This is indicated by the fact that the FileNotFoundException does not occur on 'N1'. It occurs only on the nodes that joined later (N2, N3 and N4), and the cache store '.DAT' files on those nodes are smaller than N1's.
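
      The size comparison in step 5 can be scripted so that it is repeatable after each restart. The following is only a minimal sketch using the Java standard library: the class name StoreSizeCheck, the NAME=dir argument format and the directory locations are assumptions made for illustration, not part of Infinispan. File size is also only a rough proxy for the number of stored entries, so a large gap is a hint rather than proof of missing data.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Stream;

// Hypothetical helper: compares the on-disk size of each node's store files
// against a reference node (for example, the node that ran the MassIndexer).
public class StoreSizeCheck {

    /** Sums the sizes of all regular ".dat" files found under the given directory. */
    public static long storeSizeBytes(Path storeDir) throws IOException {
        try (Stream<Path> files = Files.walk(storeDir)) {
            return files
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().toLowerCase().endsWith(".dat"))
                .mapToLong(p -> p.toFile().length())
                .sum();
        }
    }

    /** Expresses every node's store size as a percentage of the reference node's size. */
    public static Map<String, Double> percentOfReference(Map<String, Long> sizes, String referenceNode) {
        Long ref = sizes.get(referenceNode);
        if (ref == null || ref <= 0) {
            throw new IllegalArgumentException("no usable size for reference node: " + referenceNode);
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : sizes.entrySet()) {
            result.put(e.getKey(), 100.0 * e.getValue() / ref);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: StoreSizeCheck REF=dir NODE=dir [NODE=dir ...]");
            return;
        }
        // Each argument is NAME=directory; the first one is treated as the reference node.
        Map<String, Long> sizes = new LinkedHashMap<>();
        for (String arg : args) {
            String[] parts = arg.split("=", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("expected NAME=dir, got: " + arg);
            }
            sizes.put(parts[0], storeSizeBytes(Paths.get(parts[1])));
        }
        String reference = args[0].split("=", 2)[0];
        percentOfReference(sizes, reference).forEach((node, pct) ->
            System.out.printf("%s: %d bytes (%.1f%% of %s)%n", node, sizes.get(node), pct, reference));
    }
}
```

      With the figures reported above (1.2 GB on N1, 600 MB on a joining node), percentOfReference would report the joining node at 50% of N1.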

      We require urgent support.
      The corresponding Infinispan config file (neutrino-hibernatesearch-infinispan.xml) is attached.

                People

                • Assignee: Unassigned
                • Reporter: Debashish Bharali (debashish.bharali)