-
Bug
-
Resolution: Unresolved
-
Normal
-
4.18.z, 4.19.0
-
Quality / Stability / Reliability
-
False
-
Moderate
-
2025-10-15: In discussions to reclassify this as a documentation bug across all relevant versions.
-
CNF Compute Sprint 268, CNF Compute Sprint 269, CNF Compute Sprint 270, CNF Compute Sprint 271, CNF Compute Sprint 272
-
5
Description of problem:
See the steps to reproduce below.
Version-Release number of selected component (if applicable):
4.18 and later, as observed so far.
How reproducible:
Always.
Steps to Reproduce:
1. On a cluster that has NROP installed and deployed with the custom RTE SELinux policy enabled, add a new MCP and leave it empty (no nodes). Then add the new MCP as a node group in the NROP CR.
Actual results:
Although the new MCP reports as updated, the NROP CR stays in the Progressing state and never becomes Available:

NAME                                                             CONFIG                                                 UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
machineconfigpool.machineconfiguration.openshift.io/master       rendered-master-e7416b38d7bdae0e580eb1578bc2400b       True      False      False      3              3                   3                     0                      14d
machineconfigpool.machineconfiguration.openshift.io/worker       rendered-worker-6ef79ab49669b7064068cc58ecad9c01       False     True       False      2              0                   0                     0                      14d
machineconfigpool.machineconfiguration.openshift.io/worker-cnf   rendered-worker-cnf-da770c3a1535f54e6ef8e6e8aac9a254   True      False      False      0              0                   0                     0                      31m

NAME                                          DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                     AGE
daemonset.apps/numaresourcesoperator-worker   2         2         2       2            2           node-role.kubernetes.io/worker=   73m

shajmakh@shajmakh-thinkpadp16vgen1 ~ $ oc describe numaresourcesoperator
Spec:
  Log Level:  Normal
  Node Groups:
    Machine Config Pool Selector:
      Match Labels:
        pools.operator.machineconfiguration.openshift.io/worker:
    Machine Config Pool Selector:
      Match Labels:
        machineconfiguration.openshift.io/role:  worker-cnf
Status:
  Conditions:
    Last Transition Time:  2025-03-10T15:45:07Z
    Message:
    Reason:                Available
    Status:                False
    Type:                  Available
    Last Transition Time:  2025-03-10T15:45:07Z
    Message:
    Reason:                Upgradeable
    Status:                False
    Type:                  Upgradeable
    Last Transition Time:  2025-03-10T15:45:07Z
    Message:               worker is updating
    Reason:                MachineConfigPoolIsUpdating
    Status:                True
    Type:                  Progressing
    Last Transition Time:  2025-03-10T15:45:07Z
    Message:
    Reason:                Degraded
    Status:                False
    Type:                  Degraded
  Daemonsets:
    Name:       numaresourcesoperator-worker
    Namespace:  numaresources
  Machineconfigpools:
    Name:  worker
  Node Groups:
    Config:
      Info Refresh Mode:    Periodic
      Info Refresh Pause:   Disabled
      Info Refresh Period:  10s
      Pods Fingerprinting:  EnabledExclusiveResources
    Daemonsets:
      Name:       numaresourcesoperator-worker
      Namespace:  numaresources
    Selector:  worker
Expected results:
NROP should become Available (create the RTE DaemonSet; no pods are expected) and leave updates on MCPs to the MCO controller. The NROP controller will still watch for updates there.
Additional info:
Severity is set to moderate because the workaround is simple: delete the empty MCP or remove the custom policy annotation. Without the workaround, this would be considered a blocker. When the custom policy annotation is removed, behavior is normal again:

shajmakh@shajmakh-thinkpadp16vgen1 ~ $ oc get node,mcp,ds
NAME                                                  STATUS   ROLES                          AGE   VERSION
node/cnfdr11.telco5g.eng.rdu2.redhat.com              Ready    worker                         14d   v1.31.5
node/cnfdr9.telco5g.eng.rdu2.redhat.com               Ready    worker                         14d   v1.31.5
node/dhcp-10-1-105-178.telco5g.eng.rdu2.redhat.com    Ready    control-plane,master,virtual   14d   v1.31.5
node/dhcp-10-1-105-221.telco5g.eng.rdu2.redhat.com    Ready    control-plane,master,virtual   14d   v1.31.5
node/dhcp-10-1-105-44.telco5g.eng.rdu2.redhat.com     Ready    control-plane,master,virtual   14d   v1.31.5

NAME                                                             CONFIG                                                 UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
machineconfigpool.machineconfiguration.openshift.io/master       rendered-master-e7416b38d7bdae0e580eb1578bc2400b       True      False      False      3              3                   3                     0                      14d
machineconfigpool.machineconfiguration.openshift.io/worker       rendered-worker-da770c3a1535f54e6ef8e6e8aac9a254       True      False      False      2              2                   2                     0                      14d
machineconfigpool.machineconfiguration.openshift.io/worker-cnf   rendered-worker-cnf-da770c3a1535f54e6ef8e6e8aac9a254   True      False      False      0              0                   0                     0                      49m

NAME                                              DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                         AGE
daemonset.apps/numaresourcesoperator-worker       2         2         2       2            2           node-role.kubernetes.io/worker=       92m
daemonset.apps/numaresourcesoperator-worker-cnf   0         0         0       0            0           node-role.kubernetes.io/worker-cnf=   2m9s

The behavior also occurs when the default SELinux policy is in control. The only difference in the output is the reason "DaemonSetIsUpdating" instead of "MachineConfigPoolIsUpdating".
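To check whether a cluster has pools in the state this report describes (zero machines, yet reporting Updated), the JSON form of the MCP list can be filtered with a small helper. This is only a diagnostic sketch, not part of NROP or MCO: the function name empty_updated_pools is hypothetical, and it assumes the standard MachineConfigPool status fields machineCount and conditions as returned by "oc get mcp -o json".

```python
import json
from typing import List


def empty_updated_pools(mcp_list_json: str) -> List[str]:
    """Return names of MachineConfigPools that have zero machines
    and report the Updated condition as True.

    mcp_list_json is assumed to be the text produced by
    `oc get mcp -o json` (a List object with an "items" array).
    """
    doc = json.loads(mcp_list_json)
    names = []
    for item in doc.get("items", []):
        status = item.get("status", {})
        # Skip pools that actually own machines.
        if status.get("machineCount", 0) != 0:
            continue
        # Condition status values are the strings "True"/"False".
        updated = any(
            cond.get("type") == "Updated" and cond.get("status") == "True"
            for cond in status.get("conditions", [])
        )
        if updated:
            names.append(item["metadata"]["name"])
    return names
```

One way to use it is to capture the output of "oc get mcp -o json" and pass that text to empty_updated_pools. On the cluster shown above it would be expected to list worker-cnf, which is the pool to delete when applying the workaround.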