- Bug
- Resolution: Cannot Reproduce
- Major
- None
- 4.12.z
- No
- False
Description of problem:
The Node Tuning Operator updated the KubeletConfig of the cluster. This caused the cluster to start a Machine Config rollout without any user action. In more detail: the Node Tuning Operator updates the kubelet-config:
~~~
2023-07-18T21:23:27.048335958+03:00 I0718 18:23:27.048303 1 resources.go:159] Update kubelet-config "performance-worker"
~~~
From the KubeletConfig CR:
~~~
manager: cluster-node-tuning-operator
operation: Update
time: "2023-07-18T18:23:27Z"
~~~
The Machine Config Controller creates a new "99-worker-generated-kubelet" MC under the name "99-worker-generated-kubelet-1". The controller then generates a new rendered MC that includes the new kubelet-config MC:
~~~
2023-07-18T21:26:24.722890720+03:00 I0718 18:26:24.722788 1 render_controller.go:510] Generated machineconfig rendered-worker-0d708d32a1e04ec8b02986c40d10828b from 16 configs: [
  {MachineConfig 00-worker machineconfiguration.openshift.io/v1 }
  {MachineConfig 01-worker-container-runtime machineconfiguration.openshift.io/v1 }
  {MachineConfig 01-worker-kubelet machineconfiguration.openshift.io/v1 }
  {MachineConfig 50-nto-worker machineconfiguration.openshift.io/v1 }
  {MachineConfig 50-performance-worker machineconfiguration.openshift.io/v1 }
  {MachineConfig 99-worker-chrony-conf-override machineconfiguration.openshift.io/v1 }
  {MachineConfig 99-worker-generated-containerruntime machineconfiguration.openshift.io/v1 }
  {MachineConfig 99-worker-generated-kubelet machineconfiguration.openshift.io/v1 }
  {MachineConfig 99-worker-generated-kubelet-1 machineconfiguration.openshift.io/v1 }  <---------------------------- This one
  {MachineConfig 99-worker-generated-registries machineconfiguration.openshift.io/v1 }
  {MachineConfig 99-worker-legacy-kdump-configuration machineconfiguration.openshift.io/v1 }
  {MachineConfig 99-worker-ssh machineconfiguration.openshift.io/v1 }
  {MachineConfig coredump-mc-worker machineconfiguration.openshift.io/v1 }
  {MachineConfig softpanic-worker machineconfiguration.openshift.io/v1 }
  {MachineConfig worker-custom-timezone-configuration machineconfiguration.openshift.io/v1 }
  {MachineConfig worker-std-r750-1-combo-sctp machineconfiguration.openshift.io/v1 }
]
2023-07-18T21:26:24.723116641+03:00 I0718 18:26:24.723021 1 event.go:285] Event(v1.ObjectReference{Kind:"MachineConfigPool", Namespace:"", Name:"worker", UID:"e104aa2e-0729-49ae-8022-502cbb414ab1", APIVersion:"machineconfiguration.openshift.io/v1", ResourceVersion:"87008874", FieldPath:""}): type: 'Normal' reason: 'RenderedConfigGenerated' rendered-worker-0d708d32a1e04ec8b02986c40d10828b successfully generated (release version: 4.12.8, controller version: 731341b89e72d53abb349aff98d09e281e471066)
~~~
This rendered config is then applied to the cluster.
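For anyone triaging this from a live cluster rather than the must-gather, a minimal sketch of how to confirm the actors above (standard oc commands; the object names are taken from the logs in this report):
~~~
# Show server-side-apply managed fields to see which manager
# (here cluster-node-tuning-operator) last updated the CR.
oc get kubeletconfig performance-worker -o yaml --show-managed-fields

# Inspect the generated MachineConfig created by the controller.
oc get mc 99-worker-generated-kubelet-1 -o yaml

# Check which rendered config the worker pool currently targets.
oc get mcp worker -o jsonpath='{.spec.configuration.name}{"\n"}'
~~~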
The differences between the previous and the new rendered MC are as follows:
~~~
[nstamate@fedora ~]$ diff rendered-worker-0d708d32a1e04ec8b02986c40d10828b-kubelet-config.yaml rendered-worker-1102262599b5c40ca2c2aae79f1b35d0-kubelet-config.yaml
48c48,54
<     "cpuManagerReconcilePeriod": "0s",
---
>     "cpuManagerPolicy": "static",
>     "cpuManagerPolicyOptions": {
>         "full-pcpus-only": "true"
>     },
>     "cpuManagerReconcilePeriod": "5s",
>     "memoryManagerPolicy": "Static",
>     "topologyManagerPolicy": "single-numa-node",
54a61,66
>     "evictionHard": {
>         "imagefs.available": "15%",
>         "memory.available": "100Mi",
>         "nodefs.available": "10%",
>         "nodefs.inodesFree": "5%"
>     },
65,68c77,80
<     "allowedUnsafeSysctls": [
<         "fs.mqueue.*",
<         "net.*"
<     ],
---
>     "kubeReserved": {
>         "memory": "500Mi"
>     },
>     "reservedSystemCPUs": "0-3,52-55",
79c91,99
<     "shutdownGracePeriodCriticalPods": "0s"
---
>     "shutdownGracePeriodCriticalPods": "0s",
>     "reservedMemory": [
>         {
>             "numaNode": 0,
>             "limits": {
>                 "memory": "1100Mi"
>             }
>         }
>     ]
~~~
Then the changes are reverted automatically:
~~~
2023-07-19T10:39:09.804948915+03:00 I0719 07:39:09.804900 1 resources.go:159] Update kubelet-config "performance-worker"
~~~
The Machine Config Controller then targets the previous rendered MC, and a new rollout starts. The admin did not perform any action. A must-gather and other manifests are attached to the case.
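For reference, the kubelet-config.yaml files diffed above can be extracted from the rendered MachineConfigs. A minimal sketch, assuming the kubelet configuration is embedded as a URL-encoded data: URI at /etc/kubernetes/kubelet.conf (the usual MCO layout; adjust the decoding step if the contents are base64-encoded instead):
~~~
# Pull the embedded kubelet.conf out of each rendered MC and decode it.
for mc in rendered-worker-0d708d32a1e04ec8b02986c40d10828b \
          rendered-worker-1102262599b5c40ca2c2aae79f1b35d0; do
  oc get mc "$mc" \
    -o jsonpath='{.spec.config.storage.files[?(@.path=="/etc/kubernetes/kubelet.conf")].contents.source}' \
    | sed 's/^data:[^,]*,//' \
    | python3 -c 'import sys, urllib.parse; print(urllib.parse.unquote(sys.stdin.read()))' \
    > "$mc-kubelet-config.yaml"
done

diff rendered-worker-0d708d32a1e04ec8b02986c40d10828b-kubelet-config.yaml \
     rendered-worker-1102262599b5c40ca2c2aae79f1b35d0-kubelet-config.yaml
~~~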
Version-Release number of selected component (if applicable):
4.12.8 (per the RenderedConfigGenerated log line above)
How reproducible:
Cannot reproduce as it happened only once.
Steps to Reproduce:
N/A (the issue occurred only once and could not be reproduced)
Actual results:
The Node Tuning Operator changes the kubelet config at seemingly random times, triggering an unsolicited Machine Config rollout.
Expected results:
The Node Tuning Operator should not change the kubelet config unless the corresponding configuration is changed by the user.
Additional info:
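Since the trigger could not be reproduced, one way to watch for a recurrence (standard oc commands; the grep pattern matches the NTO log line quoted above):
~~~
# Watch the Node Tuning Operator for further unsolicited kubelet-config updates.
oc logs -n openshift-cluster-node-tuning-operator \
  deployment/cluster-node-tuning-operator --since=24h \
  | grep 'Update kubelet-config'

# Track rendered-config churn on the worker pool.
oc get mcp worker -o jsonpath='{.status.configuration.name}{"\n"}'
oc get events -A --field-selector reason=RenderedConfigGenerated
~~~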