OpenShift Bugs / OCPBUGS-64782

Reporting degraded status while correctly recreating a control plane machine


    • Quality / Stability / Reliability
    • Moderate

      Description of problem:

      
      To test the new boot image update functionality for ControlPlaneMachineSet, in MCO we update the ControlPlaneMachineSet with the right image and then, to verify that it works, we delete a control plane machine so that it is recreated. The ControlPlaneMachineSet replaces the machine correctly. Nevertheless, it reports this condition:
          {
            "lastTransitionTime": "2025-11-05T10:17:31Z",
            "message": "Found 1 unmanaged node(s)",
            "reason": "UnmanagedNodes",
            "status": "True",
            "type": "Degraded"
          },
      This causes a failure in the following test:
       [Monitor:legacy-cvo-invariants][bz-Cloud Compute] clusteroperator/control-plane-machine-set should not change condition/Degraded (4h40m17s)
      {  2 unexpected clusteroperator state transitions during e2e test run.  These did not match any known exceptions, so they cause this test-case to fail:
      
      Nov 05 16:03:46.214 E clusteroperator/control-plane-machine-set condition/Degraded reason/UnmanagedNodes status/True Found 1 unmanaged node(s)
      Nov 05 16:03:46.214 - 315ms E clusteroperator/control-plane-machine-set condition/Degraded reason/UnmanagedNodes status/True Found 1 unmanaged node(s)
      
      1 unwelcome but acceptable clusteroperator state transitions during e2e test run.  These should not happen, but because they are tied to exceptions, the fact that they did happen is not sufficient to cause this test-case to fail:
      
      Nov 05 16:03:46.530 W clusteroperator/control-plane-machine-set condition/Degraded reason/AsExpected status/False (exception: Degraded=False is the happy case)
      }
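To quantify how short the blip is, the monitor lines above can be parsed and the time between the Degraded=True and Degraded=False transitions computed. This is only a minimal sketch: the regular expression, the helper names `parse_transitions` and `degraded_blip`, and the assumed year 2025 (taken from the `lastTransitionTime` in the condition) are illustrative and not part of the test framework.

```python
import re
from datetime import datetime, timedelta

# Monitor lines copied from the failing test output above.
LOG = """\
Nov 05 16:03:46.214 E clusteroperator/control-plane-machine-set condition/Degraded reason/UnmanagedNodes status/True Found 1 unmanaged node(s)
Nov 05 16:03:46.214 - 315ms E clusteroperator/control-plane-machine-set condition/Degraded reason/UnmanagedNodes status/True Found 1 unmanaged node(s)
Nov 05 16:03:46.530 W clusteroperator/control-plane-machine-set condition/Degraded reason/AsExpected status/False (exception: Degraded=False is the happy case)
"""

# Hypothetical pattern for point-in-time events. Interval lines
# ("... - 315ms E ...") deliberately do not match and are skipped.
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3} \d{2} \d{2}:\d{2}:\d{2}\.\d{3}) (?P<level>[EWI]) "
    r"clusteroperator/(?P<operator>\S+) condition/(?P<condition>\S+) "
    r"reason/(?P<reason>\S+) status/(?P<status>\S+)"
)

# Assumption: the log lines carry no year, so one is supplied explicitly.
ASSUMED_YEAR = 2025


def parse_transitions(log):
    """Return one dict per point-in-time condition event found in the log."""
    events = []
    for line in log.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        when = datetime.strptime(
            f"{ASSUMED_YEAR} {match['stamp']}", "%Y %b %d %H:%M:%S.%f"
        )
        events.append(
            {
                "time": when,
                "operator": match["operator"],
                "condition": match["condition"],
                "reason": match["reason"],
                "status": match["status"],
            }
        )
    return events


def degraded_blip(events):
    """Return the time from the first Degraded=True to the next Degraded=False."""
    start = None
    for event in events:
        if event["condition"] != "Degraded":
            continue
        if event["status"] == "True" and start is None:
            start = event["time"]
        elif event["status"] == "False" and start is not None:
            return event["time"] - start
    return None


if __name__ == "__main__":
    print(degraded_blip(parse_transitions(LOG)))
```

For the lines above, the computed gap is roughly a third of a second, in line with the 315ms interval the monitor itself reports.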
      
          

      Version-Release number of selected component (if applicable):

      IPI on AWS version 4.21 
          

      How reproducible:

      Always, or at least very frequently.
          

      Steps to Reproduce:

          1. Delete a control plane machine
          2. Wait for the control plane machine to be replaced
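While waiting for the replacement in step 2, the operator's Degraded condition can be inspected to catch the blip. The sketch below covers only the inspection step. It assumes the input is the JSON printed by `oc get clusteroperator control-plane-machine-set -o json`; the helper names `find_condition` and `is_degraded` and the sample payload (built from the condition quoted in this report) are illustrative.

```python
import json


def find_condition(cluster_operator, cond_type):
    """Return the condition of the given type from a ClusterOperator dict, or None."""
    conditions = cluster_operator.get("status", {}).get("conditions", [])
    for cond in conditions:
        if cond.get("type") == cond_type:
            return cond
    return None


def is_degraded(cluster_operator):
    """True when the ClusterOperator currently reports Degraded=True."""
    cond = find_condition(cluster_operator, "Degraded")
    return cond is not None and cond.get("status") == "True"


# Sample payload, trimmed to the fields used here (an assumption about
# what the caller has already fetched from the cluster).
SAMPLE = json.loads(
    """
    {
      "status": {
        "conditions": [
          {
            "lastTransitionTime": "2025-11-05T10:17:31Z",
            "message": "Found 1 unmanaged node(s)",
            "reason": "UnmanagedNodes",
            "status": "True",
            "type": "Degraded"
          }
        ]
      }
    }
    """
)

if __name__ == "__main__":
    cond = find_condition(SAMPLE, "Degraded")
    print(is_degraded(SAMPLE), cond["reason"], cond["message"])
```

Because the blip in the CI run lasted only a few hundred milliseconds, periodic polling can easily miss it; watching the resource for changes is likely a more reliable way to observe it.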
          
          

      Actual results:

      
      The ControlPlaneMachineSet resource blips and reports a temporary degradation even though no problem occurred:
      
          {
            "lastTransitionTime": "2025-11-05T10:17:31Z",
            "message": "Found 1 unmanaged node(s)",
            "reason": "UnmanagedNodes",
            "status": "True",
            "type": "Degraded"
          },
      
      This blip causes a failure in the following CI test:

       [Monitor:legacy-cvo-invariants][bz-Cloud Compute] clusteroperator/control-plane-machine-set should not change condition/Degraded (4h40m17s)
      {  2 unexpected clusteroperator state transitions during e2e test run.  These did not match any known exceptions, so they cause this test-case to fail:
      
      Nov 05 16:03:46.214 E clusteroperator/control-plane-machine-set condition/Degraded reason/UnmanagedNodes status/True Found 1 unmanaged node(s)
      Nov 05 16:03:46.214 - 315ms E clusteroperator/control-plane-machine-set condition/Degraded reason/UnmanagedNodes status/True Found 1 unmanaged node(s)
      
      1 unwelcome but acceptable clusteroperator state transitions during e2e test run.  These should not happen, but because they are tied to exceptions, the fact that they did happen is not sufficient to cause this test-case to fail:
      
      Nov 05 16:03:46.530 W clusteroperator/control-plane-machine-set condition/Degraded reason/AsExpected status/False (exception: Degraded=False is the happy case)
      }
      
          

      Expected results:

      The control-plane-machine-set cluster operator does not report Degraded=True while a deleted control plane machine is being replaced correctly.

          

      Additional info:

      
      More information in this Slack conversation: https://redhat-internal.slack.com/archives/CBZHF4DHC/p1762439419291169
      
          

              ddonati@redhat.com Damiano Donati
              sregidor@redhat.com Sergio Regidor de la Rosa
              Zhaohua Sun