Type: Bug
Resolution: Unresolved
Priority: Normal
Fix Version/s: None
Affects Version/s: 4.21
Component/s: Cloud Compute / ControlPlaneMachineSet
Labels:
None

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:
None
Story Points:
None
Severity:
Moderate
Regression:
None

Target Backport Versions:
None
Target Version:
None
Release Blocker:
None
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:


To test the new boot image update functionality for ControlPlaneMachineSet, in MCO we update the ControlPlaneMachineSet with the right image and then, to verify that it works, we delete a control plane machine so that it gets recreated. The ControlPlaneMachineSet replaces the machine correctly. Nevertheless, it reports this condition:
    {
      "lastTransitionTime": "2025-11-05T10:17:31Z",
      "message": "Found 1 unmanaged node(s)",
      "reason": "UnmanagedNodes",
      "status": "True",
      "type": "Degraded"
    },
This causes the following test to fail:
[Monitor:legacy-cvo-invariants][bz-Cloud Compute] clusteroperator/control-plane-machine-set should not change condition/Degraded (4h40m17s)
{  2 unexpected clusteroperator state transitions during e2e test run.  These did not match any known exceptions, so they cause this test-case to fail:

Nov 05 16:03:46.214 E clusteroperator/control-plane-machine-set condition/Degraded reason/UnmanagedNodes status/True Found 1 unmanaged node(s)
Nov 05 16:03:46.214 - 315ms E clusteroperator/control-plane-machine-set condition/Degraded reason/UnmanagedNodes status/True Found 1 unmanaged node(s)

1 unwelcome but acceptable clusteroperator state transitions during e2e test run.  These should not happen, but because they are tied to exceptions, the fact that they did happen is not sufficient to cause this test-case to fail:

Nov 05 16:03:46.530 W clusteroperator/control-plane-machine-set condition/Degraded reason/AsExpected status/False (exception: Degraded=False is the happy case)
}

Version-Release number of selected component (if applicable):

IPI on AWS version 4.21

How reproducible:

Always. Or at least very frequently.

Steps to Reproduce:

    1. Delete a control plane machine.
    2. Wait for the control plane machine to be replaced.

Actual results:


The ControlPlaneMachineSet blips and briefly reports Degraded=True, even though the replacement completes without any problem:

    {
      "lastTransitionTime": "2025-11-05T10:17:31Z",
      "message": "Found 1 unmanaged node(s)",
      "reason": "UnmanagedNodes",
      "status": "True",
      "type": "Degraded"
    },
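
For triage, a minimal sketch of how this condition could be detected programmatically. It assumes the conditions have already been read as a list of dictionaries (for example from the resource's status); `find_degraded` is a hypothetical helper, not part of any OpenShift tooling, and the sample data is the condition quoted in this report.

```python
import json


def find_degraded(conditions, reason="UnmanagedNodes"):
    """Return the conditions that are Degraded=True with the given reason.

    Hypothetical helper for illustration only.
    """
    return [
        c
        for c in conditions
        if c.get("type") == "Degraded"
        and c.get("status") == "True"
        and c.get("reason") == reason
    ]


# Condition copied from this report, wrapped in a list.
sample = json.loads(
    """
    [
      {
        "lastTransitionTime": "2025-11-05T10:17:31Z",
        "message": "Found 1 unmanaged node(s)",
        "reason": "UnmanagedNodes",
        "status": "True",
        "type": "Degraded"
      }
    ]
    """
)

for cond in find_degraded(sample):
    print(cond["lastTransitionTime"], cond["message"])
```

Polling the resource while the machine is being replaced and passing each snapshot through such a filter would show when the condition appears and clears.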

This blip causes the following CI test to fail:

[Monitor:legacy-cvo-invariants][bz-Cloud Compute] clusteroperator/control-plane-machine-set should not change condition/Degraded (4h40m17s)
{  2 unexpected clusteroperator state transitions during e2e test run.  These did not match any known exceptions, so they cause this test-case to fail:

Nov 05 16:03:46.214 E clusteroperator/control-plane-machine-set condition/Degraded reason/UnmanagedNodes status/True Found 1 unmanaged node(s)
Nov 05 16:03:46.214 - 315ms E clusteroperator/control-plane-machine-set condition/Degraded reason/UnmanagedNodes status/True Found 1 unmanaged node(s)

1 unwelcome but acceptable clusteroperator state transitions during e2e test run.  These should not happen, but because they are tied to exceptions, the fact that they did happen is not sufficient to cause this test-case to fail:

Nov 05 16:03:46.530 W clusteroperator/control-plane-machine-set condition/Degraded reason/AsExpected status/False (exception: Degraded=False is the happy case)
}
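
To quantify how short the blip is, the monitor lines above can be parsed with a small script. This is only a sketch: the regular expression is an assumption derived from the three lines quoted in this report, not a specification of the monitor output format, and `parse_line`, `short_degraded_blips`, and the 1000 ms threshold are hypothetical choices made for illustration.

```python
import re

# Assumed line shape, inferred only from the samples in this report:
#   <timestamp> [- <duration>] <level> clusteroperator/<name>
#   condition/<type> reason/<reason> status/<status> <message>
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}\.\d{3})"
    r"(?: - (?P<dur>\d+)(?P<unit>ms|s))?"
    r" (?P<level>[EWI]) clusteroperator/(?P<operator>\S+)"
    r" condition/(?P<condition>\S+) reason/(?P<reason>\S+)"
    r" status/(?P<status>\S+)"
    r"(?: (?P<message>.*))?$"
)


def parse_line(line):
    """Parse one monitor line into a dict, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    rec = match.groupdict()
    if rec["dur"] is None:
        rec["duration_ms"] = None  # point event, no interval length given
    else:
        factor = 1 if rec["unit"] == "ms" else 1000
        rec["duration_ms"] = int(rec["dur"]) * factor
    return rec


def short_degraded_blips(lines, max_ms=1000):
    """Return Degraded=True intervals that lasted at most max_ms (arbitrary threshold)."""
    blips = []
    for line in lines:
        rec = parse_line(line)
        if rec is None or rec["duration_ms"] is None:
            continue
        if (
            rec["condition"] == "Degraded"
            and rec["status"] == "True"
            and rec["duration_ms"] <= max_ms
        ):
            blips.append(rec)
    return blips


# Lines copied from the test output in this report.
LINES = [
    "Nov 05 16:03:46.214 E clusteroperator/control-plane-machine-set condition/Degraded reason/UnmanagedNodes status/True Found 1 unmanaged node(s)",
    "Nov 05 16:03:46.214 - 315ms E clusteroperator/control-plane-machine-set condition/Degraded reason/UnmanagedNodes status/True Found 1 unmanaged node(s)",
    "Nov 05 16:03:46.530 W clusteroperator/control-plane-machine-set condition/Degraded reason/AsExpected status/False (exception: Degraded=False is the happy case)",
]

for rec in short_degraded_blips(LINES):
    print(rec["operator"], rec["reason"], rec["duration_ms"], "ms")
```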

Expected results:

The ControlPlaneMachineSet replaces the deleted control plane machine without reporting Degraded=True, so the clusteroperator/control-plane-machine-set Degraded condition does not change during the replacement.

Additional info:


More information in this Slack conversation: https://redhat-internal.slack.com/archives/CBZHF4DHC/p1762439419291169

Assignee: Damiano Donati

Reporter: Sergio Regidor de la Rosa

QA Contact: Zhaohua Sun

Need Info From: None

Votes: 0

Watchers: 5

Created: 2025/11/07 10:07 AM

Updated: 2025/12/03 4:13 PM
