This is a clone of issue OCPBUGS-42320. The following is the description of the original issue:
—
Description of problem:
I am opening this bug to report how difficult it was to identify an API overload caused by the ignition-server in a disconnected HCP. The customer reported an increased level of network traffic in an ACM cluster hosting 3 HCPs. Troubleshooting required inspecting the kube-apiserver audit logs of the host cluster, which showed the `ignition-server` service account generating 382320 requests in 22 hours (roughly 290 requests per minute). No alerts in the cluster pointed to the ignition-server as the source of the issue, and the ignition-server pods were not even restarting. The ignition-server was confirmed as the source only by disabling it and seeing the network traffic drop in the cluster metrics.
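The audit-log inspection described above can be sketched as a per-user request count. This is a minimal sketch, not the procedure actually used on the customer cluster: it assumes the audit log is available as JSON lines in the standard Kubernetes audit event format (one event per line, with a `user.username` field), and the namespace `clusters-example` in the sample data is hypothetical.

```python
import json
from collections import Counter


def count_requests_by_user(lines):
    """Count audit events per user.username; lines that are not JSON objects are skipped."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate truncated or non-JSON lines
        if not isinstance(event, dict):
            continue
        user = event.get("user")
        if isinstance(user, dict):
            username = user.get("username", "<unknown>")
        else:
            username = "<unknown>"
        counts[username] += 1
    return counts


def requests_per_minute(total_requests, hours):
    """Average request rate over the observation window."""
    return total_requests / (hours * 60)


if __name__ == "__main__":
    # Hypothetical sample events; real input would be the audit log lines.
    sample = [
        '{"user": {"username": "system:serviceaccount:clusters-example:ignition-server"}}',
        '{"user": {"username": "system:serviceaccount:clusters-example:ignition-server"}}',
        '{"user": {"username": "system:admin"}}',
    ]
    for username, n in count_requests_by_user(sample).most_common():
        print(n, username)
    # Figures from this report: 382320 requests in 22 hours.
    print(round(requests_per_minute(382320, 22), 1))
```

Sorting by count puts the noisiest client first, which is how a single service account such as `ignition-server` would stand out from the rest.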
Version-Release number of selected component (if applicable):
OpenShift 4.14, MCE 2.5
How reproducible:
Always at the customer cluster.
Steps to Reproduce:
1. Start a disconnected HCP cluster with incorrect mirror-registry information.
2. Observe the increased load on the host cluster API when the ignition-server pods start.
Actual results:
The ignition-server fails continuously and overloads the host cluster API, and it is difficult to identify it as the source.
Expected results:
An alert should be triggered, or the ignition-server pods should fail to start. At the very least, the ignition-server should not overload the API.
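One common way to keep a failing retry loop from overloading an API is capped exponential backoff. The sketch below only illustrates that general technique. It is not the fix shipped for this bug, and the function name and parameter values are hypothetical.

```python
def backoff_delays(base=1.0, factor=2.0, cap=300.0, attempts=10):
    """Return the wait (in seconds) before each retry, growing by `factor` up to `cap`."""
    delays = []
    delay = base
    for _ in range(attempts):
        delays.append(min(delay, cap))
        delay *= factor
    return delays


if __name__ == "__main__":
    # With these hypothetical defaults, retries slow down to at most one every 5 minutes.
    print(backoff_delays())
```

With a cap like this, a component that keeps failing settles at a low, steady request rate instead of retrying at a constant high rate.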
Additional info:
- blocks: OCPBUGS-47533 HCP ignition-server silently overloading the host cluster API if not able to get ignition payload (Closed)
- clones: OCPBUGS-42320 HCP ignition-server silently overloading the host cluster API if not able to get ignition payload (Verified)
- is blocked by: OCPBUGS-42320 HCP ignition-server silently overloading the host cluster API if not able to get ignition payload (Verified)
- is cloned by: OCPBUGS-47533 HCP ignition-server silently overloading the host cluster API if not able to get ignition payload (Closed)
- links to: RHEA-2024:6122 OpenShift Container Platform 4.18.z bug fix update