RHBK-3020 (Red Hat build of Keycloak)

JGroups errors when running a containerized Keycloak in Strict FIPS mode and with Istio [GHI#39454]


      Before reporting an issue

      [x] I have read and understood the above terms for submitting issues, and I understand that my issue may be closed without action if I do not follow them.

      Area

      infinispan

      Describe the bug

      When running a containerized Keycloak on top of UDS Core in Strict FIPS mode, we noticed the following errors:

      2025-05-05 09:06:56,052 WARNING [org.bouncycastle.jsse.provider.ProvTlsClient] (TQ-Bundler-7,keycloak-2-19472) [client #2 @28736426] raised fatal(2) internal_error(80) alert: Failed to read record: java.net.SocketException: Connection reset
      at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:318)
      at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:346)
      at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:796)
      at java.base/java.net.Socket$SocketInputStream.read(Socket.java:1099)
      at org.bouncycastle.tls.RecordStream$Record.fillTo(RecordStream.java:442)
      at org.bouncycastle.tls.RecordStream$Record.readHeader(RecordStream.java:481)
      at org.bouncycastle.tls.RecordStream.readRecord(RecordStream.java:216)
      at org.bouncycastle.tls.TlsProtocol.safeReadRecord(TlsProtocol.java:879)
      at org.bouncycastle.tls.TlsProtocol.blockForHandshake(TlsProtocol.java:427)
      at org.bouncycastle.tls.TlsClientProtocol.connect(TlsClientProtocol.java:88)
      at org.bouncycastle.jsse.provider.ProvSSLSocketDirect.startHandshake(ProvSSLSocketDirect.java:425)
      at org.bouncycastle.jsse.provider.ProvSSLSocketDirect.startHandshake(ProvSSLSocketDirect.java:406)
      at org.jgroups.blocks.cs.TcpConnection.connect(TcpConnection.java:98)
      at org.jgroups.blocks.cs.TcpConnection.connect(TcpConnection.java:86)
      at org.jgroups.blocks.cs.BaseServer.getConnection(BaseServer.java:353)
      at org.jgroups.blocks.cs.BaseServer.getConnection(BaseServer.java:317)
      at org.jgroups.blocks.cs.BaseServer.send(BaseServer.java:257)
      at org.jgroups.protocols.TCP.send(TCP.java:109)
      at org.jgroups.protocols.BasicTCP.sendUnicast(BasicTCP.java:152)
      at org.jgroups.protocols.TP.sendTo(TP.java:1416)
      at org.jgroups.protocols.TP.doSend(TP.java:1396)
      at org.jgroups.protocols.BaseBundler.sendSingleMessage(BaseBundler.java:125)
      at org.jgroups.protocols.BaseBundler.sendBundledMessages(BaseBundler.java:108)
      at org.jgroups.protocols.TransferQueueBundler.run(TransferQueueBundler.java:134)
      at java.base/java.lang.Thread.run(Thread.java:1583)
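
The trace above shows the TCP connection being reset while the JGroups client side was still waiting for a TLS record during the handshake. As a minimal sketch of that symptom only (not a diagnosis of this issue), the following Python script starts a local peer that resets the first incoming connection and then attempts a TLS handshake against it. The helper names `start_resetting_peer` and `try_handshake` are hypothetical, and the `SO_LINGER` layout assumes a Linux or macOS host.

```python
import socket
import ssl
import struct
import threading


def start_resetting_peer() -> tuple[socket.socket, int]:
    """Listen on an ephemeral local port and reset the first connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    def reset_first_connection() -> None:
        conn, _ = listener.accept()
        # SO_LINGER with a zero timeout makes close() send a TCP RST
        # instead of the normal FIN sequence.
        conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
        conn.close()

    threading.Thread(target=reset_first_connection, daemon=True).start()
    return listener, listener.getsockname()[1]


def try_handshake(port: int) -> str:
    """Attempt a TLS handshake and report how it ended."""
    context = ssl.create_default_context()
    # Certificate checks are irrelevant here: the peer never answers.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=5) as raw:
            with context.wrap_socket(raw):
                return "handshake completed"
    except OSError as exc:  # includes ConnectionResetError and ssl.SSLError
        return f"handshake failed: {type(exc).__name__}"


listener, port = start_resetting_peer()
result = try_handshake(port)
listener.close()
print(result)
```

The handshake fails before any certificate or cipher negotiation takes place, which is the same stage at which `ProvSSLSocketDirect.startHandshake` fails in the trace. The exact exception type reported can vary by platform.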

      
      

      Other relevant (anonymized) log entries:

      # JGroups start with Kubernetes stack and mTLS
      
      2025-05-05 09:06:44,672 INFO  [org.keycloak.quarkus.runtime.storage.infinispan.CacheManagerFactory] (main) JGroups Encryption enabled (mTLS).
      2025-05-05 09:06:45,138 INFO  [org.keycloak.infinispan.module.certificates.CertificateReloadManager] (main) Starting JGroups certificate reload manager
      2025-05-05 09:06:45,526 INFO  [org.infinispan.CONTAINER] (main) ISPN000556: Starting user marshaller 'org.infinispan.commons.marshall.ImmutableProtoStreamMarshaller'
      2025-05-05 09:06:46,033 INFO  [org.infinispan.CLUSTER] (main) ISPN000078: Starting JGroups channel `ISPN` with stack `kubernetes`
      
      # Below shows that each node forms a singleton cluster:
      
      2025-05-05 09:06:56,059 INFO  [org.infinispan.CLUSTER] (main) ISPN000094: Received new cluster view for channel ISPN: [keycloak-2-19472|0] (1) [keycloak-2-19472]
      
      # FIPS configuration
      
      2025-05-05 11:59:49,641 TRACE [org.keycloak.common.crypto.CryptoIntegration] (main) Java security providers: [
       KC(BCFIPS version 2.0 Approved Mode, FIPS-JVM: enabled) version 1.0 - class org.keycloak.crypto.fips.KeycloakFipsSecurityProvider,
       BCFIPS version 2.0 - class org.bouncycastle.jcajce.provider.BouncyCastleFipsProvider,
       BCJSSE version 2.0019 - class org.bouncycastle.jsse.provider.BouncyCastleJsseProvider,
       SunPKCS11-NSS-FIPS version 21 - class sun.security.pkcs11.SunPKCS11,
       SUN version 21 - class sun.security.provider.Sun,
       SunEC version 21 - class sun.security.ec.SunEC,
       SunJSSE version 21 - class sun.security.ssl.SunJSSE,
       SunJCE version 21 - class com.sun.crypto.provider.SunJCE,
       SunRsaSign version 21 - class sun.security.rsa.SunRsaSign,
       XMLDSig version 21 - class org.jcp.xml.dsig.internal.dom.XMLDSigRI,
      ]
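
The provider list above can be checked mechanically. The following Python sketch parses the entries from the TRACE line and reports where the JSSE providers sit in the list. The JCA generally consults providers in preference order when no provider is named explicitly, so BCJSSE appearing ahead of SunJSSE is consistent with the BouncyCastle frames in the stack trace. The regular expression and the helper name `parse_providers` are assumptions made for this illustration.

```python
import re

# Provider entries copied from the TRACE line in this issue.
PROVIDER_LINES = """\
 KC(BCFIPS version 2.0 Approved Mode, FIPS-JVM: enabled) version 1.0 - class org.keycloak.crypto.fips.KeycloakFipsSecurityProvider,
 BCFIPS version 2.0 - class org.bouncycastle.jcajce.provider.BouncyCastleFipsProvider,
 BCJSSE version 2.0019 - class org.bouncycastle.jsse.provider.BouncyCastleJsseProvider,
 SunPKCS11-NSS-FIPS version 21 - class sun.security.pkcs11.SunPKCS11,
 SUN version 21 - class sun.security.provider.Sun,
 SunEC version 21 - class sun.security.ec.SunEC,
 SunJSSE version 21 - class sun.security.ssl.SunJSSE,
 SunJCE version 21 - class com.sun.crypto.provider.SunJCE,
 SunRsaSign version 21 - class sun.security.rsa.SunRsaSign,
 XMLDSig version 21 - class org.jcp.xml.dsig.internal.dom.XMLDSigRI,
"""

# Assumed entry layout: "<name> version <version> - class <class name>,"
ENTRY = re.compile(
    r"^\s*(?P<name>.+?) version (?P<version>\S+) - class (?P<cls>[\w.$]+),?\s*$"
)


def parse_providers(text: str) -> list[tuple[str, str, str]]:
    """Return (name, version, class) tuples in the order they are listed."""
    providers = []
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            providers.append((match["name"], match["version"], match["cls"]))
    return providers


providers = parse_providers(PROVIDER_LINES)

# 1-based positions of the providers whose name ends in "JSSE".
jsse_positions = [
    (position, name)
    for position, (name, _, _) in enumerate(providers, start=1)
    if name.endswith("JSSE")
]

for position, name in jsse_positions:
    print(f"{position}: {name}")
```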
      

      In the UDS configuration, we have no control over the hardware or the Kubernetes Node Operating System. This particular test was performed on EKS with a non-FIPS Kubernetes Node. However, we received information from the field that this also happens on RHEL (FIPS-enabled) hosts.

      Version

      26.2.2

      Regression

      [ ] The issue is a regression

      Expected behavior

      Keycloak should run JGroups and Infinispan seamlessly in FIPS mode (with JVM FIPS either enabled or disabled, and in both Strict and Non-Strict mode).

      Actual behavior

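      Keycloak logs TLS handshake errors (Connection reset) on the JGroups transport, and JGroups doesn't form a cluster: each node ends up in its own singleton cluster.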

      How to Reproduce?

      We have a stable reproducer in UDS. Please follow:

      1. Install all required software from https://uds.defenseunicorns.com/getting-started/basic-requirements/
      2. Follow the install and deploy steps at https://uds.defenseunicorns.com/getting-started/install-and-deploy-uds/
      3. Check out the changes from https://github.com/defenseunicorns/uds-core/pull/1518
      4. Run: uds run test-uds-core-ha

      The above commands spin up a k3d cluster with 3 clustered nodes on top of it. The error is visible in the logs.

      I believe this issue can be reproduced in other ways. The relevant part seems to be deploying Keycloak in containers in FIPS (Strict) mode. Our StatefulSet can be found here: https://gist.github.com/slaskawi/7267900fd1d774e3a93c2e83eca72817

      Anything else?

      This issue might be a regression, but we don't know which version introduced it.

              Unassigned Unassigned
              pvlha Pavel Vlha
              Keycloak SRE