OpenShift Bugs / OCPBUGS-63136

Master node in MC cluster reports insufficient memory because the MAPI controller uses too much memory


    • Quality / Stability / Reliability
    • Moderate
    • CLOUD Sprint 278
    • In Progress
    • Bug Fix
      * Cause - The controller creates and deletes a file with a random name when setting up authentication to AWS.
      * Consequence - The controller continuously allocates more memory.
      * Fix - Use the same file name instead of a random one (see the sketch below).
      * Result - The kernel reuses the dentry instead of requesting a new one for each file.
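
      A minimal Go sketch of the before/after pattern described above. The function names, the "aws-creds" file name, and the directory handling are hypothetical illustrations, not the actual MAPI controller code.

        package main

        import (
            "log"
            "os"
            "path/filepath"
        )

        // writeCredsRandom mimics the old behavior: a file with a random
        // name is created and deleted on every call. Each unique name adds
        // a dentry to the kernel's dentry cache, and that slab memory is
        // charged to the pod's cgroup, so repeated calls grow the pod's
        // apparent memory usage.
        func writeCredsRandom(dir string, creds []byte) error {
            f, err := os.CreateTemp(dir, "aws-creds-*") // random suffix per call
            if err != nil {
                return err
            }
            defer os.Remove(f.Name())
            defer f.Close()
            _, err = f.Write(creds)
            return err
        }

        // writeCredsFixed mimics the fix: the same file name is reused on
        // every call, so the kernel reuses a single dentry instead of
        // requesting a new one each time.
        func writeCredsFixed(dir string, creds []byte) error {
            return os.WriteFile(filepath.Join(dir, "aws-creds"), creds, 0o600)
        }

        func main() {
            dir := os.TempDir()
            if err := writeCredsRandom(dir, []byte("example")); err != nil {
                log.Fatal(err)
            }
            if err := writeCredsFixed(dir, []byte("example")); err != nil {
                log.Fatal(err)
            }
        }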

      This is a clone of issue OCPBUGS-38759. The following is the description of the original issue:

      Description of problem:
      Primary on-call received an ExtremelyHighIndividualControlPlaneMemory
      alert (https://redhat.pagerduty.com/incidents/Q2U3B5WD4300DY) on the HyperShift MC cluster hs-mc-i0npt9ce0.

      Cluster name: hs-mc-i0npt9ce0
      Cluster ID: 32a39ea3-1c2c-4786-b991-b04742ad5fdf

      A master node is experiencing extremely high memory usage:

        ip-10-0-0-227.ap-southeast-4.compute.internal    hs-mc-i0npt9ce0-qqqtk-master-bg7s8-0    🏛  master  20d    1487m (9%)   13572Mi (23%)
        ip-10-0-1-151.ap-southeast-4.compute.internal    hs-mc-i0npt9ce0-qqqtk-master-899lq-1    🏛  master  20d    654m (4%)    57859Mi (99%) 🔥
        ip-10-0-2-126.ap-southeast-4.compute.internal    hs-mc-i0npt9ce0-qqqtk-master-xh7tl-2    🏛  master  20d    1162m (7%)   17166Mi (29%)

      The MAPI controller took over 46G of memory on this node.
      The Dynatrace link for the problematic pod:
      https://zwz85475.apps.dynatrace.com/ui/apps/dynatrace.classic.technologies/#processdetails;gtf=2024-08-21T11:00:00+12:00%20to%202024-08-21T13:00:00+12:00;gf=all;id=PROCESS_GROUP_INSTANCE-E552062AFBF57B4A

      No obvious error was found in the pod. I suspect a memory leak in the MAPI controller, since the memory usage goes up gradually.
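
      If dentry-cache growth is suspected (the root cause later identified in the release note above), it can be checked on the node by watching the kernel's dentry counters while the controller runs. A minimal sketch, assuming a Linux node with /proc mounted:

        package main

        import (
            "fmt"
            "log"
            "os"
            "strings"
        )

        // Reads /proc/sys/fs/dentry-state, whose first two fields are the
        // total and unused dentry counts. A total that climbs steadily
        // while the controller runs points at dentry-cache growth rather
        // than a heap leak inside the process.
        func main() {
            data, err := os.ReadFile("/proc/sys/fs/dentry-state")
            if err != nil {
                log.Fatal(err)
            }
            fields := strings.Fields(string(data))
            if len(fields) < 2 {
                log.Fatal("unexpected dentry-state format")
            }
            fmt.Printf("total dentries: %s, unused: %s\n", fields[0], fields[1])
        }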

      I had to restart the problematic pod to lower the memory usage. I have no idea how to reproduce the issue.

              rh-ee-cschlott Christian Schlotter
              tkong-ocm Tony Kong
              Zhaohua Sun