  1. Infinispan
  2. ISPN-2950

In distributed mode cache store data should be read through the main data owner (vs directly from the store)



    Consider a distributed cache with a cache store (shared or not) and a key k owned by {N1, N2}. k is read on N3. Currently, if k is not present in N3's memory (likely, unless L1 is configured), N3's own cache store is queried and the data is loaded from there. This has several drawbacks:

    • the data might already be in the memory of an owner node (N1, N2), so reading it from disk is highly inefficient. This matters most for hot data, i.e. data requested from various nodes at the same time (see also the mailing list discussion about Lucene query performance depending on this)
    • if this is a local cache store, it might contain stale data, which would be returned to the user
    • with an async-configured cache store this would result in dirty reads, because a change might still be in the async store's memory and not yet in the store at the moment it is read by N3. (Note that using async stores still leaves room for inconsistencies when a node leaves, e.g. because the node crashes before managing to flush the async store.)

    This JIRA is about changing distribution mode so that, when asked for a specific key, a node only touches a cache store if it is an owner of that key; otherwise it first goes to the main owner of the key and reads the value from there. The ClusterCacheLoader should be deprecated as well.
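
    The difference between the current and the proposed read path can be sketched with a toy model. This is only an illustration under stated assumptions: Node, readLocally, readCurrent and readProposed are hypothetical names, not Infinispan APIs, the "remote call" to the main owner is a plain method call, and the main owner is assumed to be the first entry in the owner list.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OwnerAwareRead {

    /** Hypothetical node: an in-memory container plus a node-local cache store. */
    static final class Node {
        final String name;
        final Map<String, String> memory = new HashMap<>();
        final Map<String, String> store = new HashMap<>();
        int storeReads = 0; // counts how often this node's store was queried

        Node(String name) { this.name = name; }

        /** Looks in memory first, then falls back to this node's own store. */
        String readLocally(String key) {
            String value = memory.get(key);
            if (value != null) {
                return value;
            }
            storeReads++;
            return store.get(key);
        }
    }

    /** Current behavior as described above: any node may load from its own store. */
    static String readCurrent(Node requester, String key) {
        return requester.readLocally(key);
    }

    /** Proposed behavior: only owners touch a store; others ask the main owner. */
    static String readProposed(Node requester, List<Node> owners, String key) {
        if (owners.contains(requester)) {
            return requester.readLocally(key);
        }
        // A non-owner may still hold the entry in memory (e.g. via L1).
        String cached = requester.memory.get(key);
        if (cached != null) {
            return cached;
        }
        // Assumption: owners.get(0) is the main owner; this stands in for a remote read.
        return owners.get(0).readLocally(key);
    }

    public static void main(String[] args) {
        Node n1 = new Node("N1"), n2 = new Node("N2"), n3 = new Node("N3");
        List<Node> owners = List.of(n1, n2);

        n1.memory.put("k", "v2"); // owners hold the latest value in memory
        n2.memory.put("k", "v2");
        n3.store.put("k", "v1");  // N3's local store still has an old value

        System.out.println(readCurrent(n3, "k"));          // prints v1 (stale)
        System.out.println(readProposed(n3, owners, "k")); // prints v2
    }
}
```

    In this sketch the proposed path never queries N3's store for k, and the main owner answers from memory without touching disk, which mirrors the first two drawbacks listed above.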




          wburns@redhat.com Will Burns
          sgrinove Sanne Grinovero