- Bug
- Resolution: Done-Errata
- Undefined
- 4.12.z
This is a clone of MGMT-11551, created to backport the PRs https://github.com/openshift/assisted-installer-agent/pull/438 and https://github.com/openshift/assisted-installer-agent/pull/442 into agent-based-installer 4.12.z.
Description of the problem:
Currently, in 4.12.z, when the agent encounters a seemingly irrecoverable error, it sleeps forever.
This is not ideal: we cannot be confident that such errors are truly irrecoverable, and retrying might save the day. To avoid generating too much noise from such agents, the retry delay algorithm should use exponential backoff.
How reproducible:
A single occurrence in the last few months, while running ~100 installation jobs per weekend in a CI pipeline.
Steps to reproduce:
1. N/A, potential race condition
Actual results:
- Failed to bootstrap
Expected results:
- Successful installation.
- is blocked by
- OCPBUGS-23556 [4.13] - Deleting host from a cluster --> Host registers again after 15 mins without rebooting it
- Closed
- links to
- RHBA-2023:7823 OpenShift Container Platform 4.12.z bug fix update