OpenShift Bugs / OCPBUGS-62517

ClusterOperator olm goes Available=False with reason=CatalogdDeploymentCatalogdControllerManager_Deploying or reason=OperatorcontrollerDeploymentOperatorControllerControllerManager_Deploying during updates


    • Bug
    • Resolution: Unresolved
    • Major
    • 4.21.0
    • 4.21
    • OLM
    • Quality / Stability / Reliability
    • False
    • None
    • None
    • None
    • None
    • Rejected
    • Rhydon Sprint 278
    • 1
    • In Progress
    • Release Note Not Required
    • None
    • None
    • None
    • None
    • None

      Description of problem:

      A component must not report Available=False during the course of a normal upgrade.

      ClusterOperator olm goes Available=False with reason=CatalogdDeploymentCatalogdControllerManager_Deploying or reason=OperatorcontrollerDeploymentOperatorControllerControllerManager_Deploying during updates

      Example job: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.21-e2e-gcp-ovn-upgrade/1972489796022439936

      Sep 29 04:35:47.504 E clusteroperator/olm condition/Available reason/CatalogdDeploymentCatalogdControllerManager_Deploying status/False CatalogdDeploymentCatalogdControllerManagerAvailable: Waiting for Deployment
      Sep 29 04:35:47.504 - 52s   E clusteroperator/olm condition/Available reason/CatalogdDeploymentCatalogdControllerManager_Deploying status/False CatalogdDeploymentCatalogdControllerManagerAvailable: Waiting for Deployment
      Sep 29 04:42:35.127 E clusteroperator/olm condition/Available reason/OperatorcontrollerDeploymentOperatorControllerControllerManager_Deploying status/False OperatorcontrollerDeploymentOperatorControllerControllerManagerAvailable: Waiting for Deployment
      Sep 29 04:42:35.127 - 12s   E clusteroperator/olm condition/Available reason/OperatorcontrollerDeploymentOperatorControllerControllerManager_Deploying status/False OperatorcontrollerDeploymentOperatorControllerControllerManagerAvailable: Waiting for Deployment
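
To make the log lines above easier to triage, here is a minimal Python sketch that parses them and adds up how long the operator reported Available=False. The regular expression is inferred only from the four lines quoted in this report, and the names `Transition`, `parse_line`, and `unavailable_seconds` are hypothetical helpers, not part of any OpenShift tooling.

```python
import re
from typing import Iterable, NamedTuple, Optional

# Pattern inferred from the monitor lines quoted in this report (an assumption,
# not an official format definition). The " - 52s" duration part is optional.
LINE_RE = re.compile(
    r"^(?P<when>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}\.\d{3})"
    r"(?: - (?P<duration>\d+)s)?"
    r"\s+(?P<level>[EWI])"
    r"\s+clusteroperator/(?P<operator>\S+)"
    r"\s+condition/(?P<condition>\S+)"
    r"\s+reason/(?P<reason>\S+)"
    r"\s+status/(?P<status>\S+)"
    r"\s*(?P<message>.*)$"
)


class Transition(NamedTuple):
    when: str
    duration_s: Optional[int]  # None for point-in-time lines
    level: str
    operator: str
    condition: str
    reason: str
    status: str
    message: str


def parse_line(line: str) -> Optional[Transition]:
    """Parse one monitor line; return None if it does not match the pattern."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    duration = m.group("duration")
    return Transition(
        when=m.group("when"),
        duration_s=int(duration) if duration is not None else None,
        level=m.group("level"),
        operator=m.group("operator"),
        condition=m.group("condition"),
        reason=m.group("reason"),
        status=m.group("status"),
        message=m.group("message"),
    )


def unavailable_seconds(lines: Iterable[str]) -> int:
    """Sum the durations of interval lines that report Available=False."""
    total = 0
    for line in lines:
        t = parse_line(line)
        if t and t.condition == "Available" and t.status == "False" and t.duration_s is not None:
            total += t.duration_s
    return total


SAMPLE = [
    "Sep 29 04:35:47.504 E clusteroperator/olm condition/Available reason/CatalogdDeploymentCatalogdControllerManager_Deploying status/False CatalogdDeploymentCatalogdControllerManagerAvailable: Waiting for Deployment",
    "Sep 29 04:35:47.504 - 52s   E clusteroperator/olm condition/Available reason/CatalogdDeploymentCatalogdControllerManager_Deploying status/False CatalogdDeploymentCatalogdControllerManagerAvailable: Waiting for Deployment",
    "Sep 29 04:42:35.127 E clusteroperator/olm condition/Available reason/OperatorcontrollerDeploymentOperatorControllerControllerManager_Deploying status/False OperatorcontrollerDeploymentOperatorControllerControllerManagerAvailable: Waiting for Deployment",
    "Sep 29 04:42:35.127 - 12s   E clusteroperator/olm condition/Available reason/OperatorcontrollerDeploymentOperatorControllerControllerManager_Deploying status/False OperatorcontrollerDeploymentOperatorControllerControllerManagerAvailable: Waiting for Deployment",
]

if __name__ == "__main__":
    print(unavailable_seconds(SAMPLE))
```

For the four lines quoted in this report, the two interval entries add up to 64 seconds of Available=False (52s for catalogd, 12s for operator-controller).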
       

      Version-Release number of selected component (if applicable):

      The issue was spotted with a 4.21 to 4.21 upgrade test.

          INFO[2025-09-29T02:33:17Z] Using explicitly provided pull-spec for release initial (registry.ci.openshift.org/ocp/release:4.21.0-0.ci-2025-09-28-082535)
          INFO[2025-09-29T02:33:17Z] Using explicitly provided pull-spec for release latest (registry.ci.openshift.org/ocp/release:4.21.0-0.ci-2025-09-29-022535)

      How reproducible:

      It seems to reproduce every time in the aggregated job, but there is also a green run in a similar test.

      ### failure
      $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.21-e2e-gcp-ovn-upgrade/1972489796022439936/artifacts/e2e-gcp-ovn-upgrade/openshift-e2e-test/artifacts/junit/e2e-monitor-tests__20250929-034333.xml | grep 'clusteroperator/olm should not change condition/Available' -A1
          <testcase name="[Monitor:legacy-cvo-invariants][bz-OLM] clusteroperator/olm should not change condition/Available" time="7014.05639286">
              <failure message="">4 unexpected clusteroperator state transitions during e2e test run.  These did not match any known exceptions, so they cause this test-case to fail:&#xA;&#xA;Sep 29 04:35:47.504 E clusteroperator/olm condition/Available reason/CatalogdDeploymentCatalogdControllerManager_Deploying status/False CatalogdDeploymentCatalogdControllerManagerAvailable: Waiting for Deployment&#xA;Sep 29 04:35:47.504 - 52s   E clusteroperator/olm condition/Available reason/CatalogdDeploymentCatalogdControllerManager_Deploying status/False CatalogdDeploymentCatalogdControllerManagerAvailable: Waiting for Deployment&#xA;Sep 29 04:42:35.127 E clusteroperator/olm condition/Available reason/OperatorcontrollerDeploymentOperatorControllerControllerManager_Deploying status/False OperatorcontrollerDeploymentOperatorControllerControllerManagerAvailable: Waiting for Deployment&#xA;Sep 29 04:42:35.127 - 12s   E clusteroperator/olm condition/Available reason/OperatorcontrollerDeploymentOperatorControllerControllerManager_Deploying status/False OperatorcontrollerDeploymentOperatorControllerControllerManagerAvailable: Waiting for Deployment&#xA;&#xA;2 unwelcome but acceptable clusteroperator state transitions during e2e test run.  
These should not happen, but because they are tied to exceptions, the fact that they did happen is not sufficient to cause this test-case to fail:&#xA;&#xA;Sep 29 04:36:39.932 W clusteroperator/olm condition/Available reason/AsExpected status/True CatalogdDeploymentCatalogdControllerManagerAvailable: Deployment is available\nOperatorcontrollerDeploymentOperatorControllerControllerManagerAvailable: Deployment is available (exception: Available=True is the happy case)&#xA;Sep 29 04:42:48.072 W clusteroperator/olm condition/Available reason/AsExpected status/True CatalogdDeploymentCatalogdControllerManagerAvailable: Deployment is available\nOperatorcontrollerDeploymentOperatorControllerControllerManagerAvailable: Deployment is available (exception: Available=True is the happy case)&#xA;</failure>
      
      ### success
      $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/30308/pull-ci-openshift-origin-main-e2e-gcp-ovn-upgrade/1971564973029068800/artifacts/e2e-gcp-ovn-upgrade/openshift-e2e-test/artifacts/junit/e2e-monitor-tests__20250926-142805.xml | grep 'clusteroperator/olm should not change condition/Available' -A1
          <testcase name="[Monitor:legacy-cvo-invariants][bz-OLM] clusteroperator/olm should not change condition/Available" time="0"></testcase>
          <testcase name="[Monitor:legacy-cvo-invariants][bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available" time="0"></testcase>

      Steps to Reproduce:

          1. Run the aggregated job above
          

      Actual results:

      co/olm goes Available=False during the upgrade test.

      Expected results:

      co/olm stays Available=True during the upgrade test.
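
The expectation can be checked against a ClusterOperator object directly. The sketch below is a hedged Python example that reads the `status.conditions` list of a ClusterOperator in JSON form, such as the output of `oc get clusteroperator olm -o json`. The helper names and the sample documents are illustrative assumptions; the reason and message strings in the samples are copied from this report.

```python
import json
from typing import Optional


def available_condition(co: dict) -> Optional[dict]:
    """Return the Available condition of a ClusterOperator dict, if present."""
    for cond in co.get("status", {}).get("conditions", []):
        if cond.get("type") == "Available":
            return cond
    return None


def is_available(co: dict) -> bool:
    """True only when the Available condition exists and has status "True"."""
    cond = available_condition(co)
    return cond is not None and cond.get("status") == "True"


# Illustrative documents (assumption): trimmed to the fields used here.
SAMPLE_UNAVAILABLE = json.dumps({
    "apiVersion": "config.openshift.io/v1",
    "kind": "ClusterOperator",
    "metadata": {"name": "olm"},
    "status": {"conditions": [{
        "type": "Available",
        "status": "False",
        "reason": "CatalogdDeploymentCatalogdControllerManager_Deploying",
        "message": "CatalogdDeploymentCatalogdControllerManagerAvailable: Waiting for Deployment",
    }]},
})
SAMPLE_AVAILABLE = json.dumps({
    "apiVersion": "config.openshift.io/v1",
    "kind": "ClusterOperator",
    "metadata": {"name": "olm"},
    "status": {"conditions": [{
        "type": "Available",
        "status": "True",
        "reason": "AsExpected",
    }]},
})

if __name__ == "__main__":
    for doc in (SAMPLE_UNAVAILABLE, SAMPLE_AVAILABLE):
        co = json.loads(doc)
        cond = available_condition(co)
        print(co["metadata"]["name"], is_available(co), cond.get("reason") if cond else None)
```

Polling such a check throughout an upgrade would show the same transitions the monitor test records, though the CI test itself remains the authoritative signal.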

      Additional info:

      The failures were taken from a 4.21 to 4.21 upgrade test. Earlier versions may be affected too.

              rhn-support-jiazha Jian Zhang
              hongkliu Hongkai Liu
              None
              None
              Jian Zhang Jian Zhang
              None