Type: Bug
Resolution: Not a Bug
Priority: Normal
Target Version: 4.17.z
Impact: Quality / Stability / Reliability
Severity: Moderate
Description of problem:
Appending a new node group to a settled NUMAResourcesOperator (NROP) CR triggers a reboot of the nodes belonging to the pre-existing node group.
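For illustration, a minimal sketch of the CR shape involved, based on the oc describe output in the additional info below; the mcp-test selector label key is an assumption, mirroring the convention used for the worker pool:

apiVersion: nodetopology.openshift.io/v1
kind: NUMAResourcesOperator
metadata:
  name: numaresourcesoperator
spec:
  nodeGroups:
  # original, settled node group: its nodes are already configured
  - machineConfigPoolSelector:
      matchLabels:
        pools.operator.machineconfiguration.openshift.io/worker: ""
  # newly appended node group; its MCP has zero machines, yet appending
  # it triggers a reboot of the worker node group's nodes
  - machineConfigPoolSelector:
      matchLabels:
        pools.operator.machineconfiguration.openshift.io/mcp-test: ""  # assumed label key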
Version-Release number of selected component (if applicable):
<=4.17
How reproducible:
always
Steps to Reproduce:
1. Create a new MCP with no machines attached to it.
2. Append the new MCP selector as a new node group in the NROP CR (see the command sketch below).
3. Watch the nodes of the original node group being rebooted.
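A hedged command sketch of these steps; the mcp-test name and its pool label mirror the cluster output in the additional info below, and exact selectors may differ per cluster:

# step 1: create an empty MCP (no node carries the mcp-test role yet)
cat <<EOF | oc create -f -
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: mcp-test
spec:
  machineConfigSelector:
    matchExpressions:
    - key: machineconfiguration.openshift.io/role
      operator: In
      values: [worker, mcp-test]
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/mcp-test: ""
EOF

# step 2: append the new MCP selector as a second node group
oc patch numaresourcesoperator numaresourcesoperator --type=json \
  -p '[{"op": "add", "path": "/spec/nodeGroups/-", "value": {"machineConfigPoolSelector": {"matchLabels": {"pools.operator.machineconfiguration.openshift.io/mcp-test": ""}}}}]'

# step 3: watch the worker pool start updating and the original nodes reboot
oc get mcp,node -w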
Actual results:
Nodes get rebooted even though the update does not affect them.
Expected results:
Node groups should not affect each other; appending a new node group must not reboot nodes of existing node groups.
Additional info:
shajmakh@shajmakh-thinkpadp16vgen1 ~/ghrepo/numaresources-operator (replace-47674)$ oc get pod,ds,node,mcp
NAME                                                   READY   STATUS    RESTARTS   AGE
pod/numaresources-controller-manager-c8d4b77bf-fvwtl   1/1     Running   0          2d
pod/numaresourcesoperator-worker-25z47                 2/2     Running   2          12m
pod/numaresourcesoperator-worker-7s7xr                 2/2     Running   0          11m
pod/secondary-scheduler-755b8f4979-jqtc4               1/1     Running   0          45h

NAME                                            DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                       AGE
daemonset.apps/numaresourcesoperator-mcp-test   0         0         0       0            0           node-role.kubernetes.io/mcp-test=   12m
daemonset.apps/numaresourcesoperator-worker     1         1         1       1            1           node-role.kubernetes.io/worker=     2d

NAME            STATUS                        ROLES                  AGE    VERSION
node/master-0   Ready                         control-plane,master   2d1h   v1.30.10
node/master-1   Ready                         control-plane,master   2d1h   v1.30.10
node/master-2   Ready                         control-plane,master   2d1h   v1.30.10
node/worker-0   Ready                         worker                 2d     v1.30.10
node/worker-1   NotReady,SchedulingDisabled   worker                 2d     v1.30.10

NAME                                                         CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
machineconfigpool.machineconfiguration.openshift.io/master   rendered-master-7c77ad8b357395c1ea39c4787caa15e8   True      False      False      3              3                   3                     0                      2d1h
machineconfigpool.machineconfiguration.openshift.io/worker   rendered-worker-f3d740589822ab9eb433abadf74da6a8   False     True       False      2              1                   1                     0                      2d1h

shajmakh@shajmakh-thinkpadp16vgen1 ~/ghrepo/numaresources-operator (replace-47674)$ oc describe numaresourcesoperator
Name:         numaresourcesoperator
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  nodetopology.openshift.io/v1
Kind:         NUMAResourcesOperator
Metadata:
  Creation Timestamp:  2025-03-09T10:54:30Z
  Generation:          17
  Resource Version:    1081504
  UID:                 e086aabb-2645-4a54-910c-3ae59e762c7c
Spec:
  Log Level:  Trace
  Node Groups:
    Config:
      Info Refresh Pause:  Disabled
    Machine Config Pool Selector:
      Match Labels:
        pools.operator.machineconfiguration.openshift.io/worker:
Status:
  Conditions:
    Last Transition Time:  2025-03-11T11:03:54Z
    Message:
    Reason:                Available
    Status:                False
    Type:                  Available
    Last Transition Time:  2025-03-11T11:03:54Z
    Message:
    Reason:                Upgradeable
    Status:                False
    Type:                  Upgradeable
    Last Transition Time:  2025-03-11T11:03:54Z
    Message:
    Reason:                Progressing
    Status:                True
    Type:                  Progressing
    Last Transition Time:  2025-03-11T11:03:54Z
    Message:
    Reason:                Degraded
    Status:                False
    Type:                  Degraded
  Daemonsets:
    Name:       numaresourcesoperator-worker
    Namespace:  openshift-numaresources
  Machineconfigpools:
    Name:  worker
    Name:  mcp-test
  Related Objects:
    Group:
    Name:      openshift-numaresources
    Resource:  namespaces
    Group:     machineconfiguration.openshift.io
    Name:
    Resource:  kubeletconfigs
    Group:     machineconfiguration.openshift.io
    Name:
    Resource:  machineconfigs
    Group:     topology.node.k8s.io
    Name:
    Resource:  noderesourcetopologies
    Group:     apps
    Name:      numaresourcesoperator-worker
    Namespace: openshift-numaresources
    Resource:  daemonsets
Events:
  Type    Reason                Age                  From                      Message
  ----    ------                ----                 ----                      -------
  Normal  SuccessfulRTECreate   11m (x77 over 2d)    numaresources-controller  Created Resource-Topology-Exporter DaemonSets
  Normal  SuccessfulMCSync      7m1s (x180 over 2d)  numaresources-controller  Enabled machine configuration for worker nodes
  Normal  SuccessfulCRDInstall  60s (x186 over 2d)   numaresources-controller  Node Resource Topology CRD installed
shajmakh@shajmakh-thinkpadp16vgen1 ~/ghrepo/numaresources-operator (replace-47674)$
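One way to confirm that the reboot is driven by a config change on the original pool, assuming the node names above (the annotation keys are standard MCO node annotations):

# rendered config the node is on vs. the one it is being moved to
oc get node worker-1 \
  -o jsonpath='{.metadata.annotations.machineconfiguration\.openshift\.io/currentConfig}{"\n"}{.metadata.annotations.machineconfiguration\.openshift\.io/desiredConfig}{"\n"}'

# list MachineConfigs; the operator-generated MC for the new pool shows up here
oc get mc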
is caused by: OCPBUGS-53153 - MCO requires reboot on worker nodes when creating new MC (even for MCP with zero machine count) [New]