  1. OpenShift Bugs
  2. OCPBUGS-62626

Cluster operator image-registry reported Progressing=True with reason=NodeCADaemonUnavailable::Ready or reason=DeploymentNotCompleted for a node reboot


    • Bug
    • Resolution: Unresolved
    • 4.20
    • Image Registry
    • Quality / Stability / Reliability

      Description of problem:

      co/image-registry reported Progressing=True solely because of a node reboot, which should not happen during a normal cluster upgrade.

      Job example

      {  16 (out of 34) unexpected clusteroperator state transitions while machine-config is progressing during the upgrade window from 2025-09-26T01:29:14Z to 2025-09-26T02:39:28Z.  These did not match any known exceptions, so they cause this test-case to fail:
      
      Sep 26 02:13:47.087 W clusteroperator/image-registry condition/Progressing reason/DeploymentNotCompleted status/True Progressing: The deployment has not completed\nNodeCADaemonProgressing: The daemon set node-ca is deployed
      Sep 26 02:13:47.087 - 22s   W clusteroperator/image-registry condition/Progressing reason/DeploymentNotCompleted status/True Progressing: The deployment has not completed\nNodeCADaemonProgressing: The daemon set node-ca is deployed
      Sep 26 02:18:35.532 W clusteroperator/image-registry condition/Progressing reason/NodeCADaemonUnavailable::Ready status/True Progressing: The registry is ready\nNodeCADaemonProgressing: The daemon set node-ca is deploying node pods
      Sep 26 02:18:35.532 - 138s  W clusteroperator/image-registry condition/Progressing reason/NodeCADaemonUnavailable::Ready status/True Progressing: The registry is ready\nNodeCADaemonProgressing: The daemon set node-ca is deploying node pods
      Sep 26 02:21:37.194 W clusteroperator/image-registry condition/Progressing reason/DeploymentNotCompleted status/True Progressing: The deployment has not completed\nNodeCADaemonProgressing: The daemon set node-ca is deployed
      Sep 26 02:21:37.194 - 25s   W clusteroperator/image-registry condition/Progressing reason/DeploymentNotCompleted status/True Progressing: The deployment has not completed\nNodeCADaemonProgressing: The daemon set node-ca is deployed
      Sep 26 02:25:16.952 W clusteroperator/image-registry condition/Progressing reason/NodeCADaemonUnavailable::Ready status/True Progressing: The registry is ready\nNodeCADaemonProgressing: The daemon set node-ca is deploying node pods
      Sep 26 02:25:16.952 - 91s   W clusteroperator/image-registry condition/Progressing reason/NodeCADaemonUnavailable::Ready status/True Progressing: The registry is ready\nNodeCADaemonProgressing: The daemon set node-ca is deploying node pods
      Sep 26 02:29:02.947 W clusteroperator/image-registry condition/Progressing reason/NodeCADaemonUnavailable::Ready status/True Progressing: The registry is ready\nNodeCADaemonProgressing: The daemon set node-ca is deploying node pods
      Sep 26 02:29:02.947 - 53s   W clusteroperator/image-registry condition/Progressing reason/NodeCADaemonUnavailable::Ready status/True Progressing: The registry is ready\nNodeCADaemonProgressing: The daemon set node-ca is deploying node pods
      Sep 26 02:30:32.432 W clusteroperator/image-registry condition/Progressing reason/DeploymentNotCompleted status/True Progressing: The deployment has not completed\nNodeCADaemonProgressing: The daemon set node-ca is deployed
      Sep 26 02:30:32.432 - 24s   W clusteroperator/image-registry condition/Progressing reason/DeploymentNotCompleted status/True Progressing: The deployment has not completed\nNodeCADaemonProgressing: The daemon set node-ca is deployed
      Sep 26 02:32:28.704 W clusteroperator/image-registry condition/Progressing reason/NodeCADaemonUnavailable::Ready status/True Progressing: The registry is ready\nNodeCADaemonProgressing: The daemon set node-ca is deploying node pods
      Sep 26 02:32:28.704 - 75s   W clusteroperator/image-registry condition/Progressing reason/NodeCADaemonUnavailable::Ready status/True Progressing: The registry is ready\nNodeCADaemonProgressing: The daemon set node-ca is deploying node pods
      Sep 26 02:37:59.562 W clusteroperator/image-registry condition/Progressing reason/NodeCADaemonUnavailable::Ready status/True Progressing: The registry is ready\nNodeCADaemonProgressing: The daemon set node-ca is deploying node pods
      Sep 26 02:37:59.562 - 58s   W clusteroperator/image-registry condition/Progressing reason/NodeCADaemonUnavailable::Ready status/True Progressing: The registry is ready\nNodeCADaemonProgressing: The daemon set node-ca is deploying node pods
      
      0 unwelcome but acceptable clusteroperator state transitions while machine-config is progressing during the upgrade window from 2025-09-26T01:29:14Z to 2025-09-26T02:39:28Z, as desired.}
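
      To quantify how long the operator stayed in each state, the interval lines above (the ones carrying a "- <N>s" duration) can be summed per reason. This is a minimal sketch, assuming the line layout matches the job output quoted here; the `summarize` helper and the `INTERVAL` pattern are hypothetical names for illustration and are not part of the test suite.

```python
import re
from collections import defaultdict

# Assumption: interval lines look like the ones quoted in this report,
# e.g. "Sep 26 02:13:47.087 - 22s   W clusteroperator/... reason/... status/True ...".
# Point-in-time lines (no "- <N>s" part) deliberately do not match.
INTERVAL = re.compile(
    r"^\s*\w{3} \d+ [\d:.]+ - (?P<secs>\d+)s\s+\w "
    r"clusteroperator/(?P<operator>\S+) condition/(?P<condition>\S+) "
    r"reason/(?P<reason>\S+) status/(?P<status>\S+)"
)


def summarize(lines):
    """Return {reason: (interval_count, total_seconds)} for Progressing=True."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        m = INTERVAL.match(line)
        if not m or m["condition"] != "Progressing" or m["status"] != "True":
            continue
        entry = totals[m["reason"]]
        entry[0] += 1
        entry[1] += int(m["secs"])
    return {reason: tuple(values) for reason, values in totals.items()}


sample = [
    "Sep 26 02:13:47.087 W clusteroperator/image-registry condition/Progressing "
    "reason/DeploymentNotCompleted status/True Progressing: The deployment has not completed",
    "Sep 26 02:13:47.087 - 22s   W clusteroperator/image-registry condition/Progressing "
    "reason/DeploymentNotCompleted status/True Progressing: The deployment has not completed",
    "Sep 26 02:18:35.532 - 138s  W clusteroperator/image-registry condition/Progressing "
    "reason/NodeCADaemonUnavailable::Ready status/True Progressing: The registry is ready",
]
print(summarize(sample))
# {'DeploymentNotCompleted': (1, 22), 'NodeCADaemonUnavailable::Ready': (1, 138)}
```

      Applied to the full output quoted above, this gives 3 intervals totalling 71s for DeploymentNotCompleted and 5 intervals totalling 415s for NodeCADaemonUnavailable::Ready.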

      Version-Release number of selected component (if applicable):

      The example job does an upgrade from 4.20.0-0.ci-2025-09-23-172020 to 4.21.0-0.ci-2025-09-26-000607

      How reproducible:

      Seems to reproduce always for 4.20 to 4.21 upgrades

      Steps to Reproduce:

          1.
          2.
          3.
          

      Actual results:

          

      Expected results:

      The co does not go Progressing=True when the only change is a node reboot
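
      One way to read this expectation is that Progressing should follow rollout state rather than pod availability. The sketch below is illustrative only and is not the image-registry operator's actual logic. It assumes the DaemonSet is available as a plain dict using the standard Kubernetes fields `metadata.generation`, `status.observedGeneration`, `status.updatedNumberScheduled`, `status.desiredNumberScheduled` and `status.numberAvailable`; the function name and the returned reason strings are hypothetical.

```python
def daemonset_progressing(ds):
    """Return (progressing, reason) for a DaemonSet given as a plain dict.

    Hypothetical helper: the reason strings are made up for this sketch.
    """
    generation = ds["metadata"]["generation"]
    status = ds.get("status", {})
    desired = status.get("desiredNumberScheduled", 0)

    # The controller has not yet processed the latest spec change.
    if status.get("observedGeneration", 0) < generation:
        return True, "NewGenerationNotObserved"

    # Some pods still run an old pod template: a real rollout is in flight.
    if status.get("updatedNumberScheduled", 0) < desired:
        return True, "RollingOut"

    # numberAvailable may be below desired while a node reboots, but no
    # rollout is happening, so this is not reported as Progressing.
    return False, "AsExpected"


# A node reboot: every pod is already up to date, one is temporarily unavailable.
reboot = {
    "metadata": {"generation": 3},
    "status": {
        "observedGeneration": 3,
        "desiredNumberScheduled": 6,
        "updatedNumberScheduled": 6,
        "numberAvailable": 5,
    },
}
print(daemonset_progressing(reboot))  # (False, 'AsExpected')
```

      Under this reading, reduced availability during a reboot would be reported through some other condition, if at all, and not through Progressing.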

      Additional info:

      In the cluster scale-up/scale-down tests on AWS (see the job example), the operator went Progressing=True with reason=AWSEBSCSIDriverOperatorCR_AWSEBSDriverNodeServiceController_Deploying. I suspect the root cause is similar to the case above.

      "message": {
          "reason": "AWSEBSCSIDriverOperatorCR_AWSEBSDriverNodeServiceController_Deploying",
          "cause": "",
          "humanMessage": "AWSEBSCSIDriverOperatorCRProgressing: AWSEBSDriverNodeServiceControllerProgressing: Waiting for DaemonSet to update 7 node pods",
          "annotations": {
              "condition": "Progressing",
              "reason": "AWSEBSCSIDriverOperatorCR_AWSEBSDriverNodeServiceController_Deploying",
              "status": "True"
          }
      },

              fmissi Flavian Missi
              hongkliu Hongkai Liu
              XiuJuan Wang XiuJuan Wang