This is the conversation between rareddy, shawkins, and sbrooks about the refreshMatView system procedure: how it currently behaves and a proposed modification.
(10:56:48 AM) rareddy: sbrooks: I read through the code; I understand how it works now.
(10:58:26 AM) rareddy: sbrooks: when invalidate == true; the distributed cache is set "invalidate", so all nodes have "invalidate" state. However, the node that received the "refreshMatView" continues on to load the new contents
(10:59:04 AM) rareddy: sbrooks: if the other nodes look for the refreshed data before the load finishes, they get blocked.
(10:59:29 AM) rareddy: sbrooks: if the refresh finishes, then they get the new data
(11:01:06 AM) rareddy: sbrooks: if invalidate == false, then distributed cache does not get changed, and all nodes will keep serving the old data until the new data is refreshed, once the new data is available it is time stamped to the load time.
(11:12:42 AM) rareddy: sbrooks: now, how the other nodes behave after the load, I cannot seem to determine correctly. I think they just refresh the data from the other node; maybe shawkins can confirm
(11:17:34 AM) shawkins: rareddy: with invalidate false when the load finishes, then all nodes should pick up the updated state based upon comparing the local timestamp to the remote (see registerQuery in TempTableDataManager)
(11:21:10 AM) shawkins: rareddy: with invalidate true, things are more complicated. the remote nodes are not coordinated with respect to the load. there's a comment to that effect "//TODO: coordinate a distributed load". so with invalidate true you may load the data multiple times from the source for each node that is queried during the load in an invalid state.
(11:24:28 AM) rareddy: shawkins: but in the invalidate = true case, I see the update to the distributed cache key, would that not everybody to start loading if they see the state as loading?
(11:25:13 AM) rareddy: shawkins: let me re-phrase
(11:27:05 AM) rareddy: shawkins: since the distributed cache is invalidated, the other nodes in the cache see that the cache is being loaded at another node; can they not refrain from doing their own load?
(11:29:00 AM) shawkins: rareddy: it's not a question of whether the load could be coordinated. it was implemented without coordination for simplicity
(11:30:04 AM) rareddy: shawkins: so we should really recommend "invalidate=false" until 5.2 to keep the data in sync, would you agree?
(11:30:13 AM) shawkins: rareddy: one possible path would be to expose jbosscache distributed node locking
(11:30:27 AM) shawkins: rareddy: no. 5.2 would not be any different
(11:32:18 AM) rareddy: shawkins: so what is the value prop for the invalidate=true? immediate invalidation?
(11:32:52 AM) shawkins: rareddy: correct you ensure that stale values are no longer used
(11:33:32 AM) rareddy: shawkins: but in that same case, it serves the same stale data until the load is finished
(11:33:49 AM) rareddy: or does it block?
(11:33:56 AM) shawkins: rareddy: ?
(11:34:45 AM) rareddy: shawkins: when I issue a refresh with invalidate = true and the load takes, let's say, 5 min, and I then issue a query during that interval, does the query block?
(11:35:11 AM) shawkins: rareddy: on the same node it blocks, on a remote node it will initiate another load
(11:37:24 AM) rareddy: shawkins: ah! ok. so going back to my earlier question, using invalidate=false is the best way for them to keep the data in sync
(11:39:05 AM) shawkins: rareddy: if you are ok with stale data, then yes. however the load is still not coordinated. if another node is issued a refresh during the load, then it too will attempt a load
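The behavior described up to this point can be sketched as a small Java model. This is a hypothetical illustration, not Teiid code: the class, field, and method names are all invented here, the distributed cache is reduced to a single shared object, and a query that arrives on a remote node while the state is invalid is assumed to do nothing more than start its own load.

```java
/** Hypothetical model of the uncoordinated refresh; none of these names exist in Teiid. */
public class MatViewRefreshSketch {

    /** Stand-in for the distributed cache entry that every node can see. */
    public static final class SharedState {
        boolean valid = true;      // false while an invalidate=true refresh is in flight
        long loadTimestamp = 0L;   // time of the last completed load
    }

    /** One cluster member with its own local copy of the materialized data. */
    public static final class Node {
        long localTimestamp = 0L;  // time stamp of the local copy
        int sourceLoads = 0;       // how often this node went to the source

        /** The node that receives the refresh call always starts a load. */
        void beginRefresh(SharedState shared, boolean invalidate) {
            if (invalidate) {
                shared.valid = false;   // every node now sees the invalid state
            }
            sourceLoads++;
        }

        /** Load finished: new data is time stamped and published. */
        void finishRefresh(SharedState shared, long now) {
            localTimestamp = now;
            shared.loadTimestamp = now;
            shared.valid = true;
        }

        /** A query arriving on a node other than the one running the refresh. */
        String query(SharedState shared, long now) {
            if (!shared.valid) {
                sourceLoads++;          // no coordination: this node loads as well
                localTimestamp = now;
                return "own-load";
            }
            if (shared.loadTimestamp > localTimestamp) {
                localTimestamp = shared.loadTimestamp;  // local copy is older than remote
                return "picked-up-new";
            }
            return "served-local";      // possibly stale, but never blocked
        }
    }

    /** Total source loads when node B is queried while node A is still loading. */
    public static int sourceLoadsDuringRefresh(boolean invalidate) {
        SharedState shared = new SharedState();
        Node a = new Node();
        Node b = new Node();
        a.beginRefresh(shared, invalidate);
        b.query(shared, 50L);
        a.finishRefresh(shared, 100L);
        return a.sourceLoads + b.sourceLoads;
    }

    /** With invalidate=false, what node B sees once node A's load has finished. */
    public static String remoteQueryAfterRefresh() {
        SharedState shared = new SharedState();
        Node a = new Node();
        Node b = new Node();
        a.beginRefresh(shared, false);
        b.query(shared, 50L);           // served from the old local copy
        a.finishRefresh(shared, 100L);
        return b.query(shared, 150L);
    }

    public static void main(String[] args) {
        System.out.println("loads, invalidate=true:  " + sourceLoadsDuringRefresh(true));
        System.out.println("loads, invalidate=false: " + sourceLoadsDuringRefresh(false));
        System.out.println("remote query after load: " + remoteQueryAfterRefresh());
    }
}
```

In this model, invalidate=true costs one extra source load for each remote node queried during the refresh, while invalidate=false keeps a single load and lets remote nodes serve the old copy until the time stamp comparison picks up the new one.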
(11:41:06 AM) sbrooks: rareddy: My test is that the refreshes can be controlled, so I will load a cache onto two nodes, then refresh one with invalidate=false; after it's done I will query the second node and see which cache it returns
(11:41:34 AM) rareddy: shawkins: so to avoid getting "out of sync", they issue "invalidate=false" on one node and then deal with the staleness factor; that should keep the results in sync
(11:43:40 AM) sbrooks: rareddy: and they should be fine with the staleness factor so all nodes continue to return results. The only thing that needs to happen is once the refresh completes on one node, all nodes should not use the older version for any future queries
(11:47:19 AM) rareddy: shawkins: based on the feedback sbrooks is giving on this subject, do you think we should pursue the "co-ordinated" loads then?
(11:47:54 AM) shawkins: rareddy: that's why there's a TODO
(11:48:01 AM) rareddy: shawkins: maybe for 5.2?
(11:48:43 AM) rareddy: shawkins: cool, let's see what we can do, then
(11:48:54 AM) shawkins: rareddy: we have more implementation options in 5.2 as we can expose jgroups functionality directly
(11:50:40 AM) rareddy: shawkins: you mean for co-ordination based on events?
(11:52:12 AM) shawkins: rareddy: or jgroups distributed locks
(11:52:30 AM) shawkins: rareddy: whatever makes the most sense
(11:53:23 AM) rareddy: shawkins: ok, I do not know much about either one; do we have any JIRA to cover this? I can enter one
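The "coordinated load" idea discussed above could look roughly like the following Java sketch. It is only an illustration under loud assumptions: a local ReentrantLock stands in for whatever cluster-wide lock (JBoss Cache node locking or a JGroups lock) would actually be used, threads stand in for nodes, and every class and method name is hypothetical. Each "node" notes the data version it saw, takes the lock, and skips its own load if another node has already advanced the version.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of a coordinated load; a local lock stands in for a distributed one. */
public class CoordinatedLoadSketch {
    private final ReentrantLock loadLock = new ReentrantLock();   // assumed cluster-wide in reality
    private final AtomicLong version = new AtomicLong();          // bumped after each completed load
    private final AtomicInteger sourceLoads = new AtomicInteger();

    /** What a node sees before it decides that a refresh is needed. */
    long observeVersion() {
        return version.get();
    }

    /** Loads only if nobody else has loaded since the caller looked at the version. */
    boolean refreshIfStillNeeded(long observedVersion) {
        loadLock.lock();
        try {
            if (version.get() != observedVersion) {
                return false;               // another node finished a load in the meantime
            }
            sourceLoads.incrementAndGet();  // placeholder for the real load from the source
            version.incrementAndGet();
            return true;
        } finally {
            loadLock.unlock();
        }
    }

    /** Runs the given number of simulated nodes concurrently and returns the source load count. */
    public static int loadsForConcurrentRefreshes(int nodes) {
        CoordinatedLoadSketch cache = new CoordinatedLoadSketch();
        CountDownLatch allObserved = new CountDownLatch(nodes);
        Thread[] threads = new Thread[nodes];
        for (int i = 0; i < nodes; i++) {
            threads[i] = new Thread(() -> {
                long seen = cache.observeVersion();
                allObserved.countDown();
                try {
                    allObserved.await();    // make sure every node saw the same stale version
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                cache.refreshIfStillNeeded(seen);
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for nodes", e);
            }
        }
        return cache.sourceLoads.get();
    }

    public static void main(String[] args) {
        System.out.println("source loads for 3 concurrent refreshes: "
                + loadsForConcurrentRefreshes(3));
    }
}
```

The version check inside the lock is what prevents the repeated loads described for the uncoordinated case; an event-based design would replace the lock with a notification that the load has completed.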