Type: Bug
Resolution: Won't Do
Priority: Major
Affects Version: 4.2.10
Configuration: a cluster containing three VMs, with three gossip routers (one per VM) and approximately 40 JVM nodes per VM. The cluster uses the CENTRAL_LOCK2 protocol with the lock service, and the MERGE3 protocol to resolve sub-views.
Each VM:
MemTotal: 164873940 kB
cpu family : 6
model : 85
model name : Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
stepping : 0
microcode : 0x2006b06
cpu MHz : 2294.609
cache size : 25344 KB
16 processors
With this configuration, nodes may leave the cluster after performing work, and new nodes may join and start work. Each node performs locking operations where the lock IDs must be acquired in order: DWHITE_ISA_ENVELOPE, DWHITE_GS_ENVELOPE-1, DWHITE_ST_ENVELOPE-1. A node cannot acquire a lock in the chain without first holding the previous lock ID in the list (i.e. ISA -> GS -> ST).
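For reference, a minimal sketch of the ordered acquisition described above, assuming JGroups 4.x and a stack file (central-lock2.xml, hypothetical) that includes CENTRAL_LOCK2:
{code:java}
import org.jgroups.JChannel;
import org.jgroups.blocks.locking.LockService;

import java.util.concurrent.locks.Lock;

public class LockChainExample {
    public static void main(String[] args) throws Exception {
        JChannel ch = new JChannel("central-lock2.xml"); // hypothetical stack file with CENTRAL_LOCK2
        try {
            ch.connect("lock-cluster");
            LockService svc = new LockService(ch);

            // Locks must be acquired strictly in order: ISA -> GS -> ST
            Lock isa = svc.getLock("DWHITE_ISA_ENVELOPE");
            Lock gs  = svc.getLock("DWHITE_GS_ENVELOPE-1");
            Lock st  = svc.getLock("DWHITE_ST_ENVELOPE-1");

            isa.lock();
            try {
                gs.lock();
                try {
                    st.lock();
                    try {
                        // ... perform work while holding the full chain ...
                    } finally {
                        st.unlock();
                    }
                } finally {
                    gs.unlock();
                }
            } finally {
                isa.unlock();
            }
        } finally {
            ch.close();
        }
    }
}
{code}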
Whenever a node leaves or joins the cluster, a View change is produced, and the Coordinator must use the MERGE3 and CENTRAL_LOCK2 protocols to resolve the sub-views and rebuild the lock state table. With a View change, the Coordinator itself may also move to a different node in the cluster.
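To correlate bad rebuilds with specific view changes, a small diagnostic sketch (assuming JGroups 4.x, where the first member of a view acts as coordinator) that logs each installed view:
{code:java}
import org.jgroups.Receiver;
import org.jgroups.View;

public class ViewLogger implements Receiver {
    @Override
    public void viewAccepted(View view) {
        // CENTRAL_LOCK2 runs its server lock table on the coordinator,
        // i.e. the first member of the installed view.
        System.out.printf("view %s installed, coordinator=%s%n",
                          view.getViewId(), view.getMembers().get(0));
    }
}
{code}
This would be installed with ch.setReceiver(new ViewLogger()) before connecting the channel.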
Unfortunately, the lock state table is not always rebuilt correctly by the Coordinator. When the lock state is rebuilt incorrectly, there is a discrepancy between the server lock table and the client lock tables, and the nodes become "deadlocked", each waiting on a lock that another node does not own.
The logs from our QA show the following deadlock and table discrepancies:
ISA_ENVELOPE_DWHITE: jazzbox02008-51093 (newnode5), node client table has ISA_ENVELOPE_DWHITE, waiting on GS_ENVELOPE_DWHITE-1 (good)
GS_ENVELOPE_DWHITE-1: jazzbox02008-24988 (newnode9), node client table has nothing, waiting on ISA_ENVELOPE_DWHITE (bad)
ST_ENVELOPE_DWHITE-1: jazzbox02008-24988 (newnode9), node client table has nothing, waiting on ISA_ENVELOPE_DWHITE (bad)
Checking the respective client lock tables of the other nodes in the cluster reveals more discrepancies with the server lock table:
client locks show jazzbox02008-57169 (newnode7) has GS_ENVELOPE_DWHITE-1
client locks show no ownership for ST_ENVELOPE_DWHITE-1
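The discrepancies above can be captured at runtime by dumping each node's lock tables; a minimal sketch, assuming the LockService instance from the node's startup code is available:
{code:java}
import org.jgroups.blocks.locking.LockService;

public class LockTableDump {
    // Prints this node's client lock table; on the coordinator the
    // server lock table is included as well.
    public static void dump(LockService svc) {
        System.out.println(svc.printLocks());
    }
}
{code}
The same information should also be reachable without code via probe (e.g. probe.sh op=CENTRAL_LOCK2.printLocks), assuming the protocol exposes printLocks as a managed operation.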
The following warnings appear in the JGroups logs, for example:
jazzbox02008-51093: lock ISA_ENVELOPE_DWHITE requested by jazzbox02008-24988::104 is already present
jazzbox02008-57169: timeout waiting for RELEASE_LOCK_OK response for lock GS_ENVELOPE_DWHITE-1 (id=6935, locked=true, owner=jazzbox02008-57169::135)
JGRP000011: jazzbox02008-4353: dropped message batch from non-member jazzbox02007-26179 (view=[jazzbox02008-51093|137] (6) [jazzbox02008-51093, jazzbox02007-34942, jazzbox02008-24988, jazzbox02008-4353, jazzbox02008-44059, jazzbox02008-29636])
jazzbox02008-60504: failed sending message to <ip_address>: java.nio.channels.ClosedSelectorException
jazzbox02007-26179: terminating after 10 unsuccessful LEAVE attempts (waited 20002 ms):
I hope we can determine why the lock table is sometimes not rebuilt properly. My guess is that the ordering of the View change messages, coupled with the timestamps of the lock/unlock requests, is a key factor in resolving the problem.
Another feature that would be very helpful is the ability to specify a timeout for a lock. The JGroups Coordinator would be responsible for determining when a lock times out and would revoke it.
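Until something like that exists in CENTRAL_LOCK2, the behavior can only be approximated on the owner's side. A rough client-side sketch (the LeasedLock class and lease duration are hypothetical, and it assumes use_thread_id_for_lock_owner=false so that the unlock may come from the timer thread):
{code:java}
import org.jgroups.blocks.locking.LockService;

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.Lock;

public class LeasedLock {
    private final Lock lock;
    private final AtomicBoolean held = new AtomicBoolean();
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor();
    private volatile ScheduledFuture<?> lease;

    public LeasedLock(LockService svc, String name) {
        this.lock = svc.getLock(name);
    }

    // Acquires the lock and schedules a forced release after leaseMillis.
    public void lock(long leaseMillis) {
        lock.lock();
        held.set(true);
        lease = timer.schedule(this::release, leaseMillis, TimeUnit.MILLISECONDS);
    }

    public void unlock() {
        if (lease != null)
            lease.cancel(false);
        release();
    }

    // Guards against a double unlock if the lease fires after a manual unlock.
    private void release() {
        if (held.compareAndSet(true, false))
            lock.unlock();
    }
}
{code}
This only releases from the owner's side; a coordinator-enforced revocation, as requested, would also cover owners that have hung or been partitioned away.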
See the attached log files for a captured case where the lock table is not rebuilt correctly.