Bug
Resolution: Unresolved
Normal
4.16.z
Quality / Stability / Reliability
Description of problem:
The master MCP went Degraded with "failed to set annotations on node: unable to update node" when the machine-config-daemon lost its HTTP/2 connection to the API server. After the connection recovered, the daemon started a full configuration reconciliation even though the rendered MCP configuration had not changed (current and desired config are identical), which caused pods on the node to be restarted.
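For reference, the daemon's view of the affected node (current/desired config, state and degraded reason) can be read straight from the node annotations in the must-gather; a quick check along these lines (assuming master1 is the affected node, as in the logs below):

$ omc get nodes master1 -o yaml | grep -E 'machineconfiguration.openshift.io/(currentConfig|desiredConfig|state|reason)'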
Version-Release number of selected component (if applicable):
OCP 4.16.37 / bare metal, 3 masters + 2 workers
Additional info:
The master MCP has been running the rendered configuration from 2025-06-24, and without any change to that rendered config, the configuration was reconciled again on 2025-09-02 after an apparent API connectivity problem. We need to understand whether this behavior is expected.
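To double-check that the pool's target never changed, the pool's spec and status configuration can also be compared; both should still point at the June render (same jsonpath mechanism as used for the conditions below):

$ omc get mcp master -o jsonpath='{.spec.configuration.name}{"\n"}{.status.configuration.name}{"\n"}'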
$ omc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-015dd60bc71e16c60b395ee004216e4c   True      False      False      3              3                   3                     0                      85d
worker   rendered-worker-b4ad4ae002ac30931206364429246797   True      False      False      2              2                   2                     0                      85d

$ omc get mc -o json | jq -r ' .items[] | select(.metadata.name|test("^rendered-master-")) | [.metadata.name, .metadata.creationTimestamp] | @tsv' | sort -k2
rendered-master-c19e70676f3c17344ba77ed244fc8793   2025-06-24T13:49:46Z
rendered-master-1001201a675b96d1cb226329c79317fa   2025-06-24T14:12:11Z
rendered-master-015dd60bc71e16c60b395ee004216e4c   2025-06-24T14:12:42Z

$ omc get nodes master1 -o yaml | grep desiredConfig
machineconfiguration.openshift.io/desiredConfig: rendered-master-015dd60bc71e16c60b395ee004216e4c

$ omc get mcp master -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.lastTransitionTime}{"\n"}{end}'
RenderDegraded   False   2025-06-24T13:49:47Z
Updated          True    2025-09-02T18:00:05Z
Updating         False   2025-09-02T18:00:05Z
NodeDegraded     False   2025-09-02T18:00:05Z
Degraded         False   2025-09-02T18:00:05Z
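It would also help to know whether the connection loss hit only this node's daemon or every machine-config-daemon at once (which would point at an API/load-balancer blip rather than a node-specific problem). A rough sketch of that check, reusing only the commands already shown above (pod names are whatever the must-gather contains):

$ for p in $(omc get pods -n openshift-machine-config-operator | awk '/^machine-config-daemon-/{print $1}'); do
      echo "== $p"
      omc logs "$p" -n openshift-machine-config-operator -c machine-config-daemon | grep -c 'http2: client connection lost'
  done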
$ omc logs machine-config-daemon-vfppr -n openshift-machine-config-operator -c machine-config-daemon
...
2025-09-02T17:55:49.145320345Z I0902 17:55:49.145269 11038 certificate_writer.go:303] Certificate was synced from controllerconfig resourceVersion 46102033
2025-09-02T17:58:34.378523828Z W0902 17:58:34.378487 11038 reflector.go:462] github.com/openshift/client-go/config/informers/externalversions/factory.go:125: watch of *v1.FeatureGate ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
2025-09-02T17:58:34.378523828Z W0902 17:58:34.378487 11038 reflector.go:462] github.com/openshift/client-go/config/informers/externalversions/factory.go:125: watch of *v1.ClusterVersion ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
2025-09-02T17:58:34.378553754Z W0902 17:58:34.378492 11038 reflector.go:462] k8s.io/client-go/informers/factory.go:159: watch of *v1.Node ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
2025-09-02T17:58:34.378553754Z W0902 17:58:34.378523 11038 reflector.go:462] github.com/openshift/client-go/machineconfiguration/informers/externalversions/factory.go:125: watch of *v1.MachineConfig ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
2025-09-02T17:58:34.378553754Z W0902 17:58:34.378492 11038 reflector.go:462] github.com/openshift/client-go/machineconfiguration/informers/externalversions/factory.go:125: watch of *v1.ControllerConfig ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
2025-09-02T17:58:36.256191225Z W0902 17:58:36.256157 11038 reflector.go:462] k8s.io/client-go/informers/factory.go:159: watch of *v1.Node ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
2025-09-02T17:58:36.256310943Z E0902 17:58:36.256295 11038 writer.go:226] Marking Degraded due to: failed to set annotations on node: unable to update node "&Node{ObjectMeta:{ 0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] [] []},Spec:NodeSpec{PodCIDR:,DoNotUseExternalID:,ProviderID:,Unschedulable:false,Taints:[]Taint{},ConfigSource:nil,PodCIDRs:[],},Status:NodeStatus{Capacity:ResourceList{},Allocatable:ResourceList{},Phase:,Conditions:[]NodeCondition{},Addresses:[]NodeAddress{},DaemonEndpoints:NodeDaemonEndpoints{KubeletEndpoint:DaemonEndpoint{Port:0,},},NodeInfo:NodeSystemInfo{MachineID:,SystemUUID:,BootID:,KernelVersion:,OSImage:,ContainerRuntimeVersion:,KubeletVersion:,KubeProxyVersion:,OperatingSystem:,Architecture:,},Images:[]ContainerImage{},VolumesInUse:[],VolumesAttached:[]AttachedVolume{},Config:nil,},}": Patch "https://api-int.demai.cloudran.telefonica.net:6443/api/v1/nodes/master1": http2: client connection lost
2025-09-02T17:58:36.291669375Z I0902 17:58:36.291625 11038 certificate_writer.go:303] Certificate was synced from controllerconfig resourceVersion 46102033
2025-09-02T17:58:36.837510235Z I0902 17:58:36.837473 11038 daemon.go:739] Transitioned from state: Done -> Degraded
2025-09-02T17:58:36.837510235Z I0902 17:58:36.837493 11038 daemon.go:742] Transitioned from degraded/unreconcilable reason -> failed to set annotations on node: unable to update node "&Node{ObjectMeta:{ 0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] [] []},Spec:NodeSpec{PodCIDR:,DoNotUseExternalID:,ProviderID:,Unschedulable:false,Taints:[]Taint{},ConfigSource:nil,PodCIDRs:[],},Status:NodeStatus{Capacity:ResourceList{},Allocatable:ResourceList{},Phase:,Conditions:[]NodeCondition{},Addresses:[]NodeAddress{},DaemonEndpoints:NodeDaemonEndpoints{KubeletEndpoint:DaemonEndpoint{Port:0,},},NodeInfo:NodeSystemInfo{MachineID:,SystemUUID:,BootID:,KernelVersion:,OSImage:,ContainerRuntimeVersion:,KubeletVersion:,KubeProxyVersion:,OperatingSystem:,Architecture:,},Images:[]ContainerImage{},VolumesInUse:[],VolumesAttached:[]AttachedVolume{},Config:nil,},}": Patch "https://api-int.demai.cloudran.telefonica.net:6443/api/v1/nodes/master1": http2: client connection lost
2025-09-02T17:58:36.842880863Z W0902 17:58:36.842849 11038 daemon.go:2383] current+desiredConfig is rendered-master-015dd60bc71e16c60b395ee004216e4c but state is Degraded
2025-09-02T17:58:36.905278530Z I0902 17:58:36.905253 11038 rpm-ostree.go:308] Running captured: rpm-ostree kargs
2025-09-02T17:58:37.169799357Z I0902 17:58:37.169764 11038 daemon.go:935] Preflight config drift check successful (took 323.184747ms)
2025-09-02T17:58:37.174339580Z I0902 17:58:37.174312 11038 config_drift_monitor.go:255] Config Drift Monitor has shut down
2025-09-02T17:58:37.174339580Z I0902 17:58:37.174329 11038 update.go:2631] Adding SIGTERM protection
2025-09-02T17:58:37.221332855Z I0902 17:58:37.221303 11038 update.go:1019] Checking Reconcilable for config rendered-master-015dd60bc71e16c60b395ee004216e4c to rendered-master-015dd60bc71e16c60b395ee004216e4c
2025-09-02T17:58:37.322629406Z I0902 17:58:37.322598 11038 update.go:2609] Starting update from rendered-master-015dd60bc71e16c60b395ee004216e4c to rendered-master-015dd60bc71e16c60b395ee004216e4c: &{osUpdate:false kargs:false fips:false passwd:false files:false units:false kernelType:false
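Since the computed diff in the last line is all false, it would also be useful to confirm from the rest of this MCD log whether the no-op update went through a cordon/drain and reboot or was applied in place; a simple keyword grep (the keywords are heuristics, not exact MCO log strings) should show which path was taken:

$ omc logs machine-config-daemon-vfppr -n openshift-machine-config-operator -c machine-config-daemon | grep -iE 'cordon|drain|reboot'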