Red Hat build of Keycloak / RHBK-3932

Increased memory usage due to leaking KeycloakSession instances [GHI#43744]

      Before reporting an issue

      [x] I have read and understood the above terms for submitting issues, and I understand that my issue may be closed without action if I do not follow them.

      Area

      admin/api

      Describe the bug

      We recently updated our Keycloak from version 25.0.6 to 26.4.1. After the update, we noticed increased heap memory usage and OutOfMemoryErrors in Keycloak.

      Version

      26.4.1

      Regression

      [x] The issue is a regression

      Expected behavior

      No OOM errors.

      Actual behavior

      <img width="3732" height="1660" alt="Image" src="https://github.com/user-attachments/assets/37219936-dae2-4b8f-9acd-99607fddf558" />

      In this screenshot you can see the heap memory of our Keycloak cluster with 3 replicas over a time span of 30 days. Each replica runs in a Docker container with a memory limit of 1.5 G. We also set -XX:MaxRAMPercentage=70, so the max heap is 1.05 G.

      I marked some interesting points in the screenshot:

      1. The update was deployed. The heap looks good before it but starts to increase afterwards.
      2. Heap memory reaches the max limit and OOM errors occur. Users might experience long request times or can't access our website at all. After some time the Docker container becomes unhealthy and is restarted automatically.
      3. Not all replicas are affected; sometimes they even recover and their memory usage drops (the blue line).
      4. We decided to test increasing the container limit to 2 G, which gives 1.4 G max heap.
      5. The OOM still occurs.
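For reference, the heap limits quoted above follow from the container limit and -XX:MaxRAMPercentage. A minimal sketch of that arithmetic, assuming "G" means GiB and ignoring the alignment the JVM applies internally (the class and method names are made up for illustration):

```java
public class HeapSizing {
    // Approximate max heap in MiB for a given container memory limit and
    // -XX:MaxRAMPercentage value. The real JVM also aligns the result, so
    // the effective limit can differ slightly from this estimate.
    static long expectedMaxHeapMiB(long containerLimitMiB, int maxRamPercentage) {
        return containerLimitMiB * maxRamPercentage / 100;
    }

    public static void main(String[] args) {
        // 1.5 GiB container limit at 70 percent: 1075 MiB, about 1.05 GiB
        System.out.println(expectedMaxHeapMiB(1536, 70));
        // 2 GiB container limit at 70 percent: 1433 MiB, about 1.4 GiB
        System.out.println(expectedMaxHeapMiB(2048, 70));
        // Effective limit of the JVM running this code, for comparison
        System.out.println(Runtime.getRuntime().maxMemory() / (1024 * 1024) + " MiB");
    }
}
```

Running the same `Runtime.getRuntime().maxMemory()` check inside a container is a quick way to confirm that the JVM actually picked up the container limit.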

      For the last 3 days we have had a cron job that restarts Keycloak every day to avoid the OOM, but of course this is not a valid solution.

      We also noticed that the heap memory mostly increases during the night. We have some cron jobs running every night that synchronize user data, roles etc. between Keycloak and our app using the Keycloak API. I should also mention that we have about 170 realms in this cluster.

      How to Reproduce?

      We can only reproduce it in our production environment. In our test environment, which has fewer realms and users and less activity, the problem does not occur.

      Anything else?

      At some point I also took a heap dump to get some insight into the memory usage. It seems like the memory is mostly used for Cache and QuarkusKeycloakSession:

      <img width="3658" height="1605" alt="Image" src="https://github.com/user-attachments/assets/283671b2-6f68-44e9-89e5-65722b7a6675" />
      I also noticed one especially large QuarkusKeycloakSession:

      <img width="3648" height="1723" alt="Image" src="https://github.com/user-attachments/assets/65ca790f-bbd1-443a-99fc-991890dde010" />

      <img width="3666" height="897" alt="Image" src="https://github.com/user-attachments/assets/bdb96a67-2e0b-4a78-98e3-d36aa2e06f39" />
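For anyone who wants to capture a comparable dump, here is a minimal sketch using the JDK's `HotSpotDiagnosticMXBean`. It dumps the heap of the JVM that runs this code, so for a running Keycloak process an external tool such as `jcmd <pid> GC.heap_dump <file>` is the more likely route. The class name, method name, and file name pattern are assumptions for illustration, not how the dump in this report was produced.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

public class HeapDumper {
    // Writes a heap dump of the current JVM into a fresh temp directory
    // and returns the path of the resulting .hprof file.
    static Path dumpToTempDir() {
        try {
            Path dir = Files.createTempDirectory("heapdump");
            // Hypothetical file name; the JDK requires the .hprof extension.
            Path file = dir.resolve("heap-" + System.currentTimeMillis() + ".hprof");
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            // live=true dumps only reachable objects, which is what matters
            // when looking for instances that are retained unexpectedly.
            bean.dumpHeap(file.toString(), true);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = dumpToTempDir();
        System.out.println("Heap dump written to " + file);
    }
}
```

The resulting file can be opened in a heap analyzer to look at retained sizes per class, for example to check how many QuarkusKeycloakSession instances are still reachable.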

      Please let me know if you need any more information to investigate the issue.

              Unassigned Unassigned
              pvlha Pavel Vlha
              Keycloak SRE