OpenShift Bugs / OCPBUGS-53442

[OLMv0] should monitor the Deployment resource in OLM clusteroperator


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • 4.19.0
    • OLM
    • Quality / Stability / Reliability
    • False
    • Important
    • Rejected

      Description of problem:

      This issue was split from https://issues.redhat.com/browse/OCPBUGS-53161. The olm-operator Deployment was not ready, but the OLM ClusterOperators (CO) did not report the problem.

      jiazha-mac:~ jiazha$ omg get deploy 
      NAME                    READY  UP-TO-DATE  AVAILABLE  AGE
      catalog-operator        1/1    1           1          1594d
      olm-operator            0/1    1           0          1594d
      
      jiazha-mac:~ jiazha$ omg get co
      NAME                        VERSION  AVAILABLE  PROGRESSING  DEGRADED  SINCE
      ...
      operator-lifecycle-manager                4.15.44  True       False        False     1594d
      operator-lifecycle-manager-catalog        4.15.44  True       False        False     1594d
      operator-lifecycle-manager-packageserver  4.15.44  True       False        False     277d    
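
      The mismatch above can be spotted mechanically. The following is a minimal sketch, assuming the tabular `get deploy` output is available as text; the function name `not_ready_deployments` is hypothetical and not part of `omg` or `oc`.

```python
def not_ready_deployments(table: str) -> list[str]:
    """Return deployments whose READY column shows fewer ready than desired replicas."""
    names = []
    for line in table.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue  # not a data row in the expected NAME READY ... layout
        ready, desired = fields[1].split("/", 1)
        if int(ready) < int(desired):
            names.append(fields[0])
    return names


# Sample copied from the output in this report.
sample = """\
NAME                    READY  UP-TO-DATE  AVAILABLE  AGE
catalog-operator        1/1    1           1          1594d
olm-operator            0/1    1           0          1594d
"""

print(not_ready_deployments(sample))
```

      With the sample from this report, only `olm-operator` is flagged, while all three OLM ClusterOperators still show `AVAILABLE=True`.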

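      The reason is that the Deployment resources are not listed in the ClusterOperators' `relatedObjects`. As shown below, the only entries are the `packageserver` ClusterServiceVersion (`group: operators.coreos.com`) and the namespace from the core group (`group: ""`). The core group covers resources such as pods, services, namespaces, and configmaps, but not Deployments, which belong to the `apps` group.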

       

      jiazha-mac:~ jiazha$ oc get co  operator-lifecycle-manager -o yaml
      apiVersion: config.openshift.io/v1
      kind: ClusterOperator
      ...
        relatedObjects:
        - group: operators.coreos.com
          name: packageserver
          namespace: openshift-operator-lifecycle-manager
          resource: clusterserviceversions
      
      jiazha-mac:~ jiazha$ oc get co  operator-lifecycle-manager-catalog -o yaml
      apiVersion: config.openshift.io/v1
      kind: ClusterOperator
      ...
        relatedObjects:
        - group: ""
          name: openshift-operator-lifecycle-manager
          resource: namespaces
      
      jiazha-mac:~ jiazha$ oc get co  operator-lifecycle-manager-packageserver -o yaml
      apiVersion: config.openshift.io/v1
      kind: ClusterOperator
      metadata:
      ...
        relatedObjects:
        - group: ""
          name: openshift-operator-lifecycle-manager
          resource: namespaces
        - group: operators.coreos.com
          name: packageserver
          namespace: openshift-operator-lifecycle-manager
          resource: clusterserviceversions
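
      To make the gap concrete, here is a small sketch that checks whether a `relatedObjects` list references any Deployment. The helper `references_deployments` is a hypothetical name, and the lists are transcribed from the output above as plain dictionaries.

```python
def references_deployments(related_objects: list[dict]) -> bool:
    """True if any entry points at a Deployment (group "apps", resource "deployments")."""
    return any(
        obj.get("group") == "apps" and obj.get("resource") == "deployments"
        for obj in related_objects
    )


NAMESPACE = "openshift-operator-lifecycle-manager"

# relatedObjects of the three OLM ClusterOperators, as shown in this report.
olm_co = [
    {"group": "operators.coreos.com", "name": "packageserver",
     "namespace": NAMESPACE, "resource": "clusterserviceversions"},
]
catalog_co = [
    {"group": "", "name": NAMESPACE, "resource": "namespaces"},
]
packageserver_co = catalog_co + olm_co

for name, objs in [("operator-lifecycle-manager", olm_co),
                   ("operator-lifecycle-manager-catalog", catalog_co),
                   ("operator-lifecycle-manager-packageserver", packageserver_co)]:
    print(name, references_deployments(objs))
```

      None of the three lists contains an `apps`/`deployments` entry.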
       

      The pod phase was still `Running` even though its container was in `CrashLoopBackOff`, so the pod phase alone could not reveal the failure:

       

       

        containerStatuses:
        - containerID: cri-o://5fc05c9f987fb1e770c8f4801607c209868b65f0f198e89a98b7f5f39652fda7
          image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:4b11bb4b7da75ab1dca818dcb1f59c6e0cef5c8c8f9bea2af4e353942ad91f29
          imageID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:4b11bb4b7da75ab1dca818dcb1f59c6e0cef5c8c8f9bea2af4e353942ad91f29
          lastState:
            terminated:
              containerID: cri-o://5fc05c9f987fb1e770c8f4801607c209868b65f0f198e89a98b7f5f39652fda7
              exitCode: 0
              finishedAt: '2025-03-14T06:44:50Z'
              reason: Completed
              startedAt: '2025-03-14T06:44:49Z'
          name: olm-operator
          ready: false
          restartCount: 6
          started: false
          state:
            waiting:
              message: back-off 5m0s restarting failed container=olm-operator pod=olm-operator-c49ddd47b-x7dtc_openshift-operator-lifecycle-manager(1f907dd8-466c-496e-9715-d07e4d762a36)
              reason: CrashLoopBackOff
        hostIP: 10.41.70.36
        phase: Running
        podIP: 10.134.0.26 
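
      The status above shows why the pod phase is not a sufficient signal. As a minimal sketch, assuming the pod status has been loaded into a dictionary (for example from YAML), the container states can be inspected directly; `crashlooping_containers` is a hypothetical helper name.

```python
def crashlooping_containers(pod_status: dict) -> list[str]:
    """Return names of containers currently waiting with reason CrashLoopBackOff."""
    names = []
    for cs in pod_status.get("containerStatuses", []):
        waiting = cs.get("state", {}).get("waiting") or {}
        if waiting.get("reason") == "CrashLoopBackOff":
            names.append(cs["name"])
    return names


# Trimmed copy of the pod status from this report.
status = {
    "phase": "Running",
    "containerStatuses": [
        {
            "name": "olm-operator",
            "ready": False,
            "restartCount": 6,
            "started": False,
            "state": {"waiting": {"reason": "CrashLoopBackOff"}},
        }
    ],
}

print(status["phase"], crashlooping_containers(status))
```

      The phase reads `Running` while the only container is crash-looping, so a check based on the phase would pass.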

      It would therefore be better to monitor the `Deployment` resources so that the end user is notified promptly, for example:

       

       

      relatedObjects:
        - group: apps
          resource: deployments
          namespace: openshift-operator-lifecycle-manager
          name: catalog-operator
        - group: apps
          resource: deployments
          namespace: openshift-operator-lifecycle-manager
          name: olm-operator
        - group: apps
          resource: deployments
          namespace: openshift-operator-lifecycle-manager
          name: package-server-manager
        - group: apps
          resource: deployments
          namespace: openshift-operator-lifecycle-manager
          name: packageserver
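
      Listing the Deployments is only useful if their status feeds into the ClusterOperator conditions. The following is a rough sketch of that aggregation, not OLM's actual implementation: it assumes each Deployment is available as a dictionary and relies on the standard `Available` condition in a Deployment's `status.conditions`. The function names and the sample data are hypothetical.

```python
def deployment_is_available(deployment: dict) -> bool:
    """True if the Deployment reports an Available condition with status "True"."""
    for cond in deployment.get("status", {}).get("conditions", []):
        if cond.get("type") == "Available":
            return cond.get("status") == "True"
    return False  # no Available condition reported: treat as unavailable


def cluster_operator_available(deployments: list[dict]) -> tuple[bool, list[str]]:
    """Aggregate Deployment availability; also return the names that are unavailable."""
    unavailable = [
        d["metadata"]["name"] for d in deployments if not deployment_is_available(d)
    ]
    return (not unavailable, unavailable)


# Hypothetical sample mirroring the situation in this report.
deployments = [
    {"metadata": {"name": "catalog-operator"},
     "status": {"conditions": [{"type": "Available", "status": "True"}]}},
    {"metadata": {"name": "olm-operator"},
     "status": {"conditions": [{"type": "Available", "status": "False"}]}},
]

print(cluster_operator_available(deployments))
```

      Under these assumptions, the unavailable `olm-operator` Deployment would make the aggregate result unavailable instead of leaving the ClusterOperator at `AVAILABLE=True`.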

       

      How reproducible:

          always


              rh-ee-cchantse Catherine Chan-Tse
              rhn-support-jiazha Jian Zhang
              Jian Zhang