Type: Bug
Resolution: Done
Priority: Major
Fix Version/s: 5.3.19, 5.5.0, 5.4.9
Affects Version/s: 5.3.18
Git Pull Request:
https://github.com/belaban/JGroups/pull/921

In environments where customers deploy Keycloak via AWS Fargate, we see situations where two Keycloak instances in a cluster each register themselves as the Infinispan coordinator in the jgroups_ping table.

The following log excerpts are from a local Docker Compose reproduction.

kc1-1  | 2025-07-18 19:20:34,313 INFO  [org.jgroups.protocols.pbcast.GMS] (main) kc1-35658: no members discovered after 2 ms: creating cluster as coordinator
kc2-1  | 2025-07-18 19:20:34,314 INFO  [org.jgroups.JChannel] (main) local_addr: 6fcea7e2-3e50-4239-88a2-e3a529cdd27a, name: kc2-31834
kc2-1  | 2025-07-18 19:20:34,321 INFO  [org.jgroups.protocols.FD_SOCK2] (main) server listening on *:57800
kc2-1  | 2025-07-18 19:20:34,324 INFO  [org.jgroups.protocols.pbcast.GMS] (main) kc2-31834: no members discovered after 2 ms: creating cluster as coordinator
kc1-1  | 2025-07-18 19:20:34,325 INFO  [org.infinispan.CLUSTER] (main) ISPN000094: Received new cluster view for channel ISPN: [kc1-35658|0] (1) [kc1-35658]
kc1-1  | 2025-07-18 19:20:34,327 INFO  [org.keycloak.jgroups.certificates.CertificateReloadManager] (main) Reloading JGroups Certificate
kc2-1  | 2025-07-18 19:20:34,336 INFO  [org.infinispan.CLUSTER] (main) ISPN000094: Received new cluster view for channel ISPN: [kc2-31834|0] (1) [kc2-31834]

This situation heals itself after a few seconds, or sometimes minutes, once the two single-node clusters merge:

kc1-1  | 2025-07-18 19:21:22,368 INFO  [org.infinispan.CLUSTER] () ISPN000093: Received new, MERGED cluster view for channel ISPN: MergeView::[kc1-35658|1] (2) [kc1-35658, kc2-31834], 2 subgroups: [kc1-35658|0] (1) [kc1-35658], [kc2-31834|0] (1) [kc2-31834]
kc1-1  | 2025-07-18 19:21:22,368 INFO  [org.keycloak.jgroups.certificates.CertificateReloadManager] () Reloading JGroups Certificate
kc1-1  | 2025-07-18 19:21:22,372 INFO  [org.infinispan.CLUSTER] () ISPN100000: Node kc2-31834 joined the cluster
kc1-1  | 2025-07-18 19:21:22,372 INFO  [org.infinispan.CLUSTER] () ISPN100000: Node kc2-31834 joined the cluster
kc2-1  | 2025-07-18 19:21:23,380 INFO  [org.infinispan.CLUSTER] () ISPN000093: Received new, MERGED cluster view for channel ISPN: MergeView::[kc1-35658|1] (2) [kc1-35658, kc2-31834], 2 subgroups: [kc1-35658|0] (1) [kc1-35658], [kc2-31834|0] (1) [kc2-31834]
kc2-1  | 2025-07-18 19:21:23,380 INFO  [org.keycloak.jgroups.certificates.CertificateReloadManager] () Reloading JGroups Certificate
kc2-1  | 2025-07-18 19:21:23,385 INFO  [org.infinispan.CLUSTER] () ISPN100000: Node kc1-35658 joined the cluster
kc2-1  | 2025-07-18 19:21:23,386 INFO  [org.infinispan.CLUSTER] () ISPN100000: Node kc1-35658 joined the cluster

The situation is worse when 4 nodes are started simultaneously, because we then also see "no physical address, dropping message" in the logs.

Proposed solution

Without transactions in the JDBC_PING protocol it is not possible to fully prevent the above scenarios, but we can reduce the chance of them happening by doing the following:

1. On initial discovery, read from the DB, write the local address, and then re-read the DB table until a coordinator exists or the subsequent ping data is the same as the initial DB query. Two threads cannot interleave these steps without at least one of them reading both coordinator entries during discovery, so we are safe without adding additional database measures.
2. When remove_all_data_on_view_change=true, only remove addresses that are not part of the current view.
3. Call addDiscoveryResponseToCaches on view change to prevent "no physical address, dropping message".
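
Steps 1 and 2 above can be sketched roughly as follows. This is only an illustration of the idea, not the code from the pull request: the PingTable interface, the in-memory table, and the discover and removeStale methods are hypothetical names invented for this sketch, and the table is modelled as a plain map from address to coordinator flag. It also assumes one particular reading of "the same as the initial DB query", namely that the rows other than our own did not change between two consecutive reads.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class JdbcPingSketch {

    /** Hypothetical stand-in for the jgroups_ping table: address -> is-coordinator flag. */
    interface PingTable {
        Map<String, Boolean> readAll();
        void write(String address, boolean coordinator);
        void remove(String address);
    }

    /** In-memory table, present only so that the sketch can be run without a database. */
    static class InMemoryPingTable implements PingTable {
        private final Map<String, Boolean> rows = new HashMap<>();
        public synchronized Map<String, Boolean> readAll() { return new HashMap<>(rows); }
        public synchronized void write(String address, boolean coordinator) { rows.put(address, coordinator); }
        public synchronized void remove(String address) { rows.remove(address); }
    }

    /**
     * Step 1: read, write the local address, then re-read until a coordinator
     * shows up or the other members' rows stop changing (assumed interpretation).
     * Returns the rows of the other members; a real implementation would
     * presumably pause between re-reads.
     */
    static Map<String, Boolean> discover(PingTable table, String localAddress, int maxRereads) {
        Map<String, Boolean> previous = table.readAll();
        previous.remove(localAddress);
        table.write(localAddress, false);
        for (int i = 0; i < maxRereads; i++) {
            Map<String, Boolean> current = table.readAll();
            current.remove(localAddress);
            if (current.containsValue(Boolean.TRUE) || current.equals(previous)) {
                return current; // coordinator found, or nothing changed since the last read
            }
            previous = current; // somebody else registered in between: compare against this next time
        }
        return previous;
    }

    /** Step 2: on view change, delete only the rows of members that left the view. */
    static void removeStale(PingTable table, Set<String> viewMembers) {
        for (String address : table.readAll().keySet()) {
            if (!viewMembers.contains(address)) {
                table.remove(address);
            }
        }
    }

    public static void main(String[] args) {
        PingTable table = new InMemoryPingTable();
        table.write("kc1", true);       // existing coordinator
        table.write("old-node", false); // leftover row from a member that is gone
        System.out.println("discovered: " + discover(table, "kc2", 3));
        removeStale(table, new HashSet<>(Arrays.asList("kc1", "kc2")));
        System.out.println("table after view change: " + table.readAll());
    }
}
```

Because removeStale leaves the rows of current view members in place, a node that starts discovery during a view change can still find the existing coordinator instead of an empty table.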

links to

Keycloak Issue

Assignee: Ryan Emerson

Reporter: Ryan Emerson

Created: 2025/08/05 1:19 PM

Updated: 2025/08/22 8:18 AM

Resolved: 2025/08/22 8:18 AM
