-
Bug
-
Resolution: Done-Errata
-
Normal
-
None
-
4.13, 4.16
-
Low
-
No
-
MCO Sprint 257
-
1
-
False
-
-
Prevent no-operation patches when a node status is updated, from overloading the kube-api server with unnecessary calls
-
Release Note Not Required
-
In Progress
Description of problem
I haven't traced out the trigger-pathway yet, but 4.13 and 4.16 machine-config controllers seem to react to Node status updates with makeMasterNodeUnSchedulable calls that result in Kubernetes API PATCH calls, even when the patch being requested is empty. This creates unnecessary API volume, loading the control plane and resulting in distracting Kube-API audit log activity.
Version-release number of selected component (if applicable)
Seen in 4.13.34, and reproduced in 4.16 CI builds, so likely all intervening versions. Possibly all versions.
How reproducible
Every time.
Steps to reproduce
mco#4277 reproduces this by reverting OCPBUGS-29713 to get frequent Node status updates. And then in presubmit CI, e2e-aws-ovn > Artifacts > ... > gather-extra pod logs :
$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_machine-config-operator/4277/pull-ci-openshift-machine-config-operator-master-e2e-aws-ovn/1770970949110206464/artifacts/e2e-aws-ovn/gather-extra/artifacts/pods/openshift-machine-config-operator_machine-config-controller-bdcdf554f-ct5hh_machine-config-controller.log | tail
and:
$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_machine-config-operator/4277/pull-ci-openshift-machine-config-operator-master-e2e-aws-ovn/1770970949110206464/artifacts/e2e-aws-ovn/gather-extra/artifacts/pods/openshift-machine-config-operator_machine-config-controller-bdcdf554f-ct5hh_machine-config-controller.log | grep -o 'makeMasterNodeUnSchedulable\|UpdateNodeRetry' | sort | uniq -c
give...
Actual results
I0322 02:10:13.027938 1 node_controller.go:310] makeMasterNodeUnSchedulable ip-10-0-34-167.us-east-2.compute.internal I0322 02:10:13.029671 1 kubeutils.go:48] UpdateNodeRetry patch ip-10-0-34-167.us-east-2.compute.internal: {} I0322 02:10:13.669568 1 node_controller.go:310] makeMasterNodeUnSchedulable ip-10-0-84-206.us-east-2.compute.internal I0322 02:10:13.671023 1 kubeutils.go:48] UpdateNodeRetry patch ip-10-0-84-206.us-east-2.compute.internal: {} I0322 02:10:21.095260 1 node_controller.go:310] makeMasterNodeUnSchedulable ip-10-0-114-0.us-east-2.compute.internal I0322 02:10:21.098410 1 kubeutils.go:48] UpdateNodeRetry patch ip-10-0-114-0.us-east-2.compute.internal: {} I0322 02:10:23.215168 1 node_controller.go:310] makeMasterNodeUnSchedulable ip-10-0-34-167.us-east-2.compute.internal I0322 02:10:23.219672 1 kubeutils.go:48] UpdateNodeRetry patch ip-10-0-34-167.us-east-2.compute.internal: {} I0322 02:10:24.049456 1 node_controller.go:310] makeMasterNodeUnSchedulable ip-10-0-84-206.us-east-2.compute.internal I0322 02:10:24.050939 1 kubeutils.go:48] UpdateNodeRetry patch ip-10-0-84-206.us-east-2.compute.internal: {}
showing frequent, no-op patch attempts and:
1408 makeMasterNodeUnSchedulable 1414 UpdateNodeRetry
showing many attempts over the life of the MCC container.
Expected results
No need to PATCH on makeMasterNodeUnSchedulable unless the generated patch content contained more than a no-op patch.
Additional info
setDesiredMachineConfigAnnotation has a "DesiredMachineConfigAnnotationKey already matches what I want" no-op out here
setUpdateInProgressTaint seems to lack a similar guard here , and it should probably grow a check for "NodeUpdateInProgressTaint is already present". Same for removeUpdateInProgressTaint. But the hot loop in response to these Node status updates is makeMasterNodeUnSchedulable calling UpdateNodeRetry, and UpdateNodeRetry also lacks this kind of "no need to PATCH when no changes are requested" logic.
For this bug, we should:
- Add that kind of guard somewhere to the makeMasterNodeUnSchedulable stack. I'd personally recommend putting it down at UpdateNodeRetry.
- Optionally add similar guards to setUpdateInProgressTaint and removeUpdateInProgressTaint (or port those to use UpdateNodeRetry? If you port, you might want to port setDesiredMachineConfigAnnotation to, just for consistency).
- Optionally add logging like I'm floating in mco#4277, so it's easier to understand why the MCC thinks it needs to take externally-visible action, to help debug future "these Kube API server audit logs have the MCC doing surprising stuff..." situations.
- links to
-
RHEA-2024:3718 OpenShift Container Platform 4.17.z bug fix update