Connectivity Link / CONNLINK-678

Need to remove the limits from the RHCL Operators


      I had a case where a customer was getting OOMKilled errors for the RHCL operators because the cluster has a very high number of objects.

      During the investigation it was found that the operator has resource limits defined:

       resources:
         limits:
           cpu: 200m
           memory: 300Mi
         requests:
           cpu: 200m
           memory: 200Mi
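
      For reference, the limits actually applied to the running operator can be confirmed with something like the following; the kuadrant-system namespace is an assumption, so adjust it to the actual install:

       oc get deployment limitador-operator-controller-manager -n kuadrant-system \
         -o jsonpath='{.spec.template.spec.containers[?(@.name=="manager")].resources}'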

      Example usage on the customer cluster:

      limitador-operator-controller-manager-5687dfb9c4-hhrdt       manager                              3m           391Mi 
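
      Output like the above can be reproduced with something along these lines (again, the kuadrant-system namespace is an assumption):

       oc adm top pod -n kuadrant-system --containers | grep limitador-operator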

      We also checked from inside the pod: memory usage had been around 542Mi (VmRSS below) and gradually decreased to the 391Mi shown above.

      cat /proc/1/status | grep -E 'VmRSS|VmSize|VmSwap|VmHWM'
      VmSize:  1865492 kB
      VmHWM:    614792 kB
      VmRSS:    554828 kB <--- here it was using ~542Mi
      VmSwap:        0 kB 
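
      The same check can be run without opening a shell in the pod, e.g. via oc exec (the namespace is assumed, and this relies on grep being present in the image, which it was in this case):

       oc exec -n kuadrant-system limitador-operator-controller-manager-5687dfb9c4-hhrdt -c manager \
         -- grep -E 'VmRSS|VmSize|VmSwap|VmHWM' /proc/1/status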

      So, as a workaround, we changed the limits using the Subscription:

      spec:
      <cropped>
        config:
          resources: 
            requests:
              memory: "200Mi"
              cpu: "200m"
            limits:
              memory: "600Mi"
              cpu: "600m"
      <cropped> 
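
      For completeness, a full Subscription carrying that override could look roughly like the sketch below; the metadata, channel and catalog source values are assumptions filled in for illustration, since the original was cropped:

       apiVersion: operators.coreos.com/v1alpha1
       kind: Subscription
       metadata:
         name: limitador-operator          # assumed name
         namespace: kuadrant-system        # assumed namespace
       spec:
         channel: stable                   # assumed channel
         name: limitador-operator
         source: redhat-operators          # assumed catalog source
         sourceNamespace: openshift-marketplace
         config:
           resources:
             requests:
               memory: "200Mi"
               cpu: "200m"
             limits:
               memory: "600Mi"
               cpu: "600m"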

      So, if possible, can we remove the limits for the operators so that customers will not face the OOMKilled issue shown below?

      limitador-operator-controller-manager-64d9f88f6d-vg4jq       0/1     CrashLoopBackOff   8          20m

        lastState:
          terminated:
            containerID: cri-o://27816824fa948a78f619f82c7e0688ebf44bbaefcdf30a7d20d3e86e04697996
            exitCode: 137
            finishedAt: "2025-12-17T16:44:16Z"
            reason: OOMKilled
            startedAt: "2025-12-17T16:43:51Z"
        name: manager
        ready: false
        restartCount: 8
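
      That container status fragment can be pulled with something like this (the namespace is again an assumption):

       oc get pod limitador-operator-controller-manager-64d9f88f6d-vg4jq -n kuadrant-system \
         -o jsonpath='{.status.containerStatuses[?(@.name=="manager")].lastState.terminated}'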

      Indeed there were high object counts, so this kind of utilization is to be expected (a way to reproduce the counts is sketched after the list):

      5242 secrets
      5998 configmaps
      1624 deployments
      3969 operators.coreos.com
      3592 services
      2554 pods
      2078 rolebindings
      1796 endpointslices
      1409 networking.istio.io
      542 networkpolicies 
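
      Counts like these can be gathered per resource type with commands along these lines (cluster-wide, so they may take a while on a cluster this size):

       oc get secrets -A --no-headers | wc -l
       oc get configmaps -A --no-headers | wc -l
       oc get deployments -A --no-headers | wc -l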

      But if the customer adds even more objects, they will hit the issue again. So, if possible, the limits should be removed going forward; a requests-only override is sketched below as an interim workaround.
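
      In the meantime, and assuming OLM's spec.config.resources fully replaces the resource requirements defined in the CSV (that replacement behaviour is an assumption worth verifying), a requests-only override like this sketch would leave the operator with no limits at all:

       spec:
         config:
           resources:
             requests:
               memory: "200Mi"
               cpu: "200m"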
