- Bug
- Resolution: Done-Errata
- Normal
- 4.16, 4.17
- Moderate
- None
- MCO Sprint 257
- 1
- False
- Release Note Not Required
- In Progress
Description of problem
Seen in a 4.17 nightly-to-nightly CI update:
$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.17-e2e-aws-ovn-upgrade/1809154554084724736/artifacts/e2e-aws-ovn-upgrade/gather-extra/artifacts/events.json | jq -r '.items[] | select(.metadata.namespace == "openshift-machine-config-operator") | .reason' | sort | uniq -c | sort -n | tail -n3
     82 Pulled
     82 Started
   2116 ValidatingAdmissionPolicyUpdated
$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.17-e2e-aws-ovn-upgrade/1809154554084724736/artifacts/e2e-aws-ovn-upgrade/gather-extra/artifacts/events.json | jq -r '.items[] | select(.metadata.namespace == "openshift-machine-config-operator" and .reason == "ValidatingAdmissionPolicyUpdated").message' | sort | uniq -c
    705 Updated ValidatingAdmissionPolicy.admissionregistration.k8s.io/machine-configuration-guards because it changed
    705 Updated ValidatingAdmissionPolicy.admissionregistration.k8s.io/managed-bootimages-platform-check because it changed
    706 Updated ValidatingAdmissionPolicy.admissionregistration.k8s.io/mcn-guards because it changed
I'm not sure what those are about (which may be a bug of its own; it would be nice to know what changed), but it smells like a hot loop to me.
Version-Release number of selected component
Seen in 4.17. It is not yet clear how to audit for exposure frequency or affected versions, short of teaching the origin test suite to fail if it sees too many of these kinds of events. Maybe an "openshift-... namespaces" version of the current "events should not repeat pathologically in e2e namespaces" test case? We may already have one, but if so it is not tripping.
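To make the audit idea concrete, here is a minimal sketch of what such a check might look like when run against a gathered events.json. It is not the origin test suite's implementation. The function name find_pathological_events, the threshold of 20, and the grouping by (namespace, reason, message) are all assumptions chosen for illustration. Like the jq pipelines in this report, it counts event items and ignores any per-event count field.

```python
from collections import Counter


def find_pathological_events(events, threshold=20, namespace_prefix="openshift-"):
    """Return {(namespace, reason, message): hits} for groups seen more than `threshold` times.

    `events` is the parsed content of an events.json file (a dict with an
    "items" list). The threshold and the namespace prefix are hypothetical
    defaults, not values taken from any existing test.
    """
    counts = Counter()
    for item in events.get("items", []):
        namespace = item.get("metadata", {}).get("namespace", "")
        # Only consider namespaces matching the prefix, e.g. openshift-*.
        if not namespace.startswith(namespace_prefix):
            continue
        key = (namespace, item.get("reason", ""), item.get("message", ""))
        counts[key] += 1
    # Keep only the groups that repeat more often than the threshold allows.
    return {key: hits for key, hits in counts.items() if hits > threshold}
```

Loading a downloaded events.json with json.load and passing the result to this function should flag the three ValidatingAdmissionPolicyUpdated messages shown above, since each appears hundreds of times, while leaving low-volume reasons such as Pulled and Started alone only if the threshold is set above their counts.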
How reproducible
Besides the initial update, also seen in this 4.17.0-0.nightly-2024-07-05-091056 serial run:
$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.17-e2e-aws-ovn-serial/1809154615350923264/artifacts/e2e-aws-ovn-serial/gather-extra/artifacts/events.json | jq -r '.items[] | select(.metadata.namespace == "openshift-machine-config-operator" and .reason == "ValidatingAdmissionPolicyUpdated").message' | sort | uniq -c
   1006 Updated ValidatingAdmissionPolicy.admissionregistration.k8s.io/machine-configuration-guards because it changed
   1006 Updated ValidatingAdmissionPolicy.admissionregistration.k8s.io/managed-bootimages-platform-check because it changed
   1007 Updated ValidatingAdmissionPolicy.admissionregistration.k8s.io/mcn-guards because it changed
So possibly every time, in all 4.17 clusters?
Steps to Reproduce
1. Unclear. Possibly just install 4.17.
2. Run oc -n openshift-machine-config-operator get -o json events | jq -r '.items[] | select(.reason == "ValidatingAdmissionPolicyUpdated")'.
Actual results
Thousands of hits.
Expected results
Zero to few hits.
- clones: OCPBUGS-36654 Machine-config operator should not hot loop generating ValidatingAdmissionPolicyUpdated events (Closed)
- depends on: OCPBUGS-36654 Machine-config operator should not hot loop generating ValidatingAdmissionPolicyUpdated events (Closed)
- links to: RHBA-2024:4965 OpenShift Container Platform 4.16.z bug fix update