  1. Red Hat OpenStack Services on OpenShift
  2. OSPRH-17187

infra-operator-controller-manager out of memory error


    • 1
    • False
    • None
    • False
    • ?
    • openstack-operator-container-1.0.12-7
    • Impediment
    • rhos-conplat-core-operators
    • None
    • Ticket created for a permanent fix (RHOSO).

      openstack-operators infra-operator-controller-manager-6cff8cbd66-kht6n 1/2 OOMKilled 3 (53s ago) 2m12s 172.29.23.35 az3-worker3.os.ae-auh1-1.core42.systems <none> <none>

      Workaround applied as below:

      1. oc describe csv openstack-operator.v1.0.9 -n openstack-operators 
        root@provision:az3 (main)
      2. oc get deployment -n openstack-operators
        NAME                                              READY   UP-TO-DATE   AVAILABLE   AGE
        barbican-operator-controller-manager              1/1     1            1           4d17h
        cinder-operator-controller-manager                1/1     1            1           4d17h
        designate-operator-controller-manager             1/1     1            1           4d17h
        glance-operator-controller-manager                1/1     1            1           4d17h
        heat-operator-controller-manager                  1/1     1            1           4d17h
        horizon-operator-controller-manager               1/1     1            1           4d17h
        infra-operator-controller-manager                 1/1     1            1           4d17h
        ironic-operator-controller-manager                1/1     1            1           4d17h
        keystone-operator-controller-manager              1/1     1            1           4d17h
        manila-operator-controller-manager                1/1     1            1           4d17h
        mariadb-operator-controller-manager               1/1     1            1           4d17h
        neutron-operator-controller-manager               1/1     1            1           4d17h
        nova-operator-controller-manager                  1/1     1            1           4d17h
        octavia-operator-controller-manager               1/1     1            1           4d17h
        openstack-baremetal-operator-controller-manager   1/1     1            1           4d17h
        openstack-operator-controller-manager             1/1     1            1           4d17h
        openstack-operator-controller-operator            0/0     0            0           4d17h
        ovn-operator-controller-manager                   1/1     1            1           4d17h
        placement-operator-controller-manager             1/1     1            1           4d17h
        rabbitmq-cluster-operator-manager                 1/1     1            1           4d17h
        swift-operator-controller-manager                 1/1     1            1           4d17h
        telemetry-operator-controller-manager             1/1     1            1           4d17h
        test-operator-controller-manager                  1/1     1            1           4d17h
    • OSPK8S Sprint 1
    • 1
    • Important

      To Reproduce

      Steps to reproduce the behavior:

      Change the replica count back from 0 to 1 in the CSV openstack-operator.v1.0.9 (namespace openstack-operators). This overwrites the modified memory limits/requests with the default values, which causes the issue.

      Expected behavior

      • The infra-operator-controller-manager pod should run without errors.

      Screenshots

      • Default values:
        ---------------
        resources:
          limits:
            cpu: 500m
            memory: 256Mi
          requests:
            cpu: 10m
            memory: 128Mi

      • Updated values:
        ---------------
        resources:
          limits:
            cpu: 500m
            memory: 1Gi
          requests:
            cpu: 10m
            memory: 512Mi
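
For scale, the two sets of values can be compared in bytes. The following is a minimal sketch, not part of the original report; the helper name `to_bytes` is hypothetical and it only handles the binary suffixes (Ki, Mi, Gi) that appear in this ticket.

```shell
#!/bin/sh
# Convert a Kubernetes binary memory quantity (Ki, Mi, Gi) to bytes.
# Hypothetical helper: other suffixes (K, M, G, Ti, ...) are passed through unchanged.
to_bytes() {
  case "$1" in
    *Ki) echo $(( ${1%Ki} * 1024 )) ;;
    *Mi) echo $(( ${1%Mi} * 1024 * 1024 )) ;;
    *Gi) echo $(( ${1%Gi} * 1024 * 1024 * 1024 )) ;;
    *)   echo "$1" ;;
  esac
}

# Memory limits from the ticket: default 256Mi, updated 1Gi.
default_limit=$(to_bytes 256Mi)
updated_limit=$(to_bytes 1Gi)

# Memory requests from the ticket: default 128Mi, updated 512Mi.
default_request=$(to_bytes 128Mi)
updated_request=$(to_bytes 512Mi)

echo "limit ratio: $(( updated_limit / default_limit ))"
echo "request ratio: $(( updated_request / default_request ))"
```

Both the memory limit and the memory request in the updated values are four times the defaults; the CPU values are unchanged.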

      Device Info (please complete the following information):

        • # oc get openstackversion
          NAME                    TARGET VERSION      AVAILABLE VERSION   DEPLOYED VERSION
          openstackcontrolplane   18.0.7-20250408.2   18.0.7-20250408.2   18.0.7-20250408.2

      Bug impact

      • The infra-operator-controller-manager pod is OOMKilled and restarts repeatedly.

      Known workaround

      oc describe csv openstack-operator.v1.0.9 -n openstack-operators

      1. Update the memory limit/request values as shown above.
      2. Update the deployment replica count to 0.

      3. oc describe deployment infra-operator-controller-manager -n openstack-operators

      The values for infra-operator-controller-manager were modified as shown below:

          Limits:
            cpu:     500m
            memory:  1Gi
          Requests:
            cpu:      10m
            memory:   512Mi
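
One possible way to apply these values from the command line is `oc set resources`. This is a sketch under assumptions, not the procedure used in the ticket: the container name `manager` is not stated in the report and is assumed here, and the function name `workaround_cmd` is hypothetical. The script only prints the command so it can be reviewed before being run.

```shell
#!/bin/sh
# Sketch only: prints the command for review instead of running it.
NAMESPACE="openstack-operators"
DEPLOYMENT="infra-operator-controller-manager"
CONTAINER="manager"   # assumption: the container name is not given in the ticket

# Build an `oc set resources` command carrying the updated values from the ticket.
workaround_cmd() {
  printf 'oc set resources deployment/%s -n %s -c %s --limits=cpu=500m,memory=1Gi --requests=cpu=10m,memory=512Mi\n' \
    "$DEPLOYMENT" "$NAMESPACE" "$CONTAINER"
}

workaround_cmd
```

As the ticket notes, this is not permanent: a restart at the operator level can reset the deployment to the default values.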

      Additional context

      • A permanent fix is expected: any restart at the operator level can reset the values to their defaults, after which the pods will crash again.
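
Until a permanent fix lands, a small check can flag when the memory limit has been reset. The sketch below is an illustration only: `check_limit` and `EXPECTED_LIMIT` are hypothetical names, the comparison is a plain string match (so `1024Mi` would not match `1Gi`), and the container name `manager` in the commented example is an assumption.

```shell
#!/bin/sh
# Memory limit set by the workaround in this ticket.
EXPECTED_LIMIT="1Gi"

# Report whether an observed memory limit still matches the workaround value.
# Returns non-zero when the value differs, for example after a reset to 256Mi.
check_limit() {
  if [ "$1" = "$EXPECTED_LIMIT" ]; then
    echo "ok: memory limit is $1"
  else
    echo "reverted: memory limit is $1, expected $EXPECTED_LIMIT"
    return 1
  fi
}

# Example against a live cluster (container name "manager" is an assumption):
# check_limit "$(oc get deployment infra-operator-controller-manager -n openstack-operators \
#   -o jsonpath='{.spec.template.spec.containers[?(@.name=="manager")].resources.limits.memory}')"
```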

        1. image-2025-06-03-12-16-14-713.png
          26 kB
          Suraj Rajasekharan Nair
        2. openstack-operator-manager.yml
          22 kB
          Suraj Rajasekharan Nair

              rhn-support-mschuppe Martin Schuppert
              rh-ee-suranair Suraj Rajasekharan Nair (Inactive)
              rhos-conplat-core-operators