-
Ticket
-
Resolution: Unresolved
-
Critical
-
None
-
None
-
Incidents & Support
-
False
-
-
False
-
None
-
None
-
None
-
None
We are seeing install issues across many different AWS jobs, payloads and otherwise. The failures look like:
level=info msg=Creating infrastructure resources... level=info msg=Reconciling IAM roles for control-plane and compute nodes level=info msg=Creating IAM role for master level=error msg=failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed during pre-provisioning: failed to create IAM roles: failed to get master instance profile: operation error IAM: GetInstanceProfile, exceeded maximum number of attempts, 3, https response error StatusCode: 400, RequestID: 03248d48-8f48-45c9-a876-1920b143c4e7, api error Throttling: Rate exceeded Installer exit with code 4
There is some discussion about this here, as well as here.
This is resultant of the AWS SDK v2 client (which is also in use in 4.20). This has been in use for quite awhile now, and we are not sure why all of the sudden it is failing so frequently. Speculation is that it is due to frequency of jobs in the accounts.
This means that there really isn't anything to revert here. The potential fix is: https://github.com/openshift/installer/pull/10112
- relates to
-
CORS-4055 Remove AWS SDK V1 from asset/installconfig/aws
-
- Code Review
-