-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
4.18.z, 4.19.z, 4.20.z, 4.21.0
-
Quality / Stability / Reliability
-
False
-
-
None
-
Moderate
-
None
-
None
-
Proposed
Description of problem:
IPI installations can fail to find existing COS instances, either during bring-up for CAPI-based installations (4.19+) or during bootstrap destroy/cleanup processing. This lookup failure also results in two COS instances with the same name being created, because the first one cannot be found.
Version-Release number of selected component (if applicable):
4.20
How reproducible:
100%, although because pagination is involved, the order of the returned instances matters, so reproducing the failure requires specifically named clusters or COS instances
Steps to Reproduce:
1. Create a large number of COS instances in an IBM Cloud account (say 40 to 50 or more)
2. Run an IBM Cloud IPI build
Actual results:
level=fatal msg=error destroying bootstrap resources failed during the destroy bootstrap hook: failed retrieving cos instance for destroy bootstrap: COS Resource Not Found
time="2025-10-30T13:46:30-04:00" level=debug msg="checking for existing cos instance: pbalogh-cos27in-snvsr-cos"
time="2025-10-30T13:46:31-04:00" level=debug msg="creating cos instance: pbalogh-cos27in-snvsr-cos"
time="2025-10-30T13:46:35-04:00" level=debug msg="created cos instance: pbalogh-cos27in-snvsr-cos"
time="2025-10-30T13:46:35-04:00" level=debug msg="checking for existing cos bucket: pbalogh-cos27in-snvsr-vsi-image"
time="2025-10-30T13:46:35-04:00" level=debug msg="creating cos bucket: pbalogh-cos27in-snvsr-vsi-image"
time="2025-10-30T13:46:37-04:00" level=debug msg="created cos bucket: pbalogh-cos27in-snvsr-vsi-image"
time="2025-10-30T14:07:15-04:00" level=debug msg="retrieved resource group id: 5e5ba1b22ec24020a2b0ce50b273eabb"
time="2025-10-30T14:07:16-04:00" level=debug msg="creating cos instance: pbalogh-cos27in-snvsr-cos"
time="2025-10-30T14:07:19-04:00" level=debug msg="created cos instance: pbalogh-cos27in-snvsr-cos"
time="2025-10-30T14:07:19-04:00" level=debug msg="fetching cos instance for cluster: pbalogh-cos27in-snvsr-cos"
time="2025-10-30T14:07:19-04:00" level=debug msg="creating cos bucket for bootstrap ignition config: pbalogh-cos27in-snvsr-bootstrap-ignition"
time="2025-10-30T14:07:20-04:00" level=info msg="created cos bucket for bootstrap ignition config: pbalogh-cos27in-snvsr-cos/pbalogh-cos27in-snvsr-bootstrap-ignition"
time="2025-10-30T14:07:20-04:00" level=debug msg="uploading bootstrap ignition config to bucket: pbalogh-cos27in-snvsr-bootstrap-ignition"
time="2025-10-30T14:07:20-04:00" level=debug msg="bootstrap ignition config upload complete to pbalogh-cos27in-snvsr-cos/pbalogh-cos27in-snvsr-bootstrap-ignition/bootstrap.ign"
Expected results:
Successful IPI cluster creation, no duplicate COS instances during creation, no orphaned COS instances (from bootstrap resources)
Additional info:
This was likely introduced only during the migration to CAPI-based IPI support, when GetCOSInstanceByName was added (4.18+):
https://github.com/openshift/installer/blob/e064c5ffbac163a2d6999fe20273054ebfbafcb6/pkg/asset/installconfig/ibmcloud/client.go#L692-L711
The same pagination issue may also affect the COS bucket lookup, which might benefit from a fix too, if that is the case:
https://github.com/openshift/installer/blob/e064c5ffbac163a2d6999fe20273054ebfbafcb6/pkg/asset/installconfig/ibmcloud/client.go#L645-L667
The ResourceController API returns very basic content:
https://github.com/IBM/platform-services-go-sdk/blob/7a608d80bbd7b6224ee5234b1a4e9afaba27aa23/resourcecontrollerv2/resource_controller_v2.go#L5007-L5017
Using the Pager struct in place of the call to ListResourceInstancesWithContext would be helpful:
https://github.com/IBM/platform-services-go-sdk/blob/7a608d80bbd7b6224ee5234b1a4e9afaba27aa23/resourcecontrollerv2/resource_controller_v2.go#L5654-L5668
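To illustrate the suggested fix, here is a minimal sketch of a name lookup that walks every page instead of stopping after the first one. It does not use the real SDK: Instance, Page, Lister, FindInstanceByName and fakeLister are all hypothetical names invented for this example, the start-token format is an assumption, and context handling is omitted for brevity. In the installer, the SDK Pager linked above would presumably take the place of the hand-written loop.

```go
package main

import (
	"errors"
	"strconv"
)

// Instance is a hypothetical, trimmed-down stand-in for a resource instance.
type Instance struct {
	Name string
	ID   string
}

// Page is one page of results. NextStart is empty on the last page.
type Page struct {
	Instances []Instance
	NextStart string
}

// Lister is a hypothetical abstraction over a paginated list call.
type Lister interface {
	List(start string) (Page, error)
}

// ErrNotFound mirrors the error text seen in the installer log.
var ErrNotFound = errors.New("COS Resource Not Found")

// FindInstanceByName checks every page until the name is found or the
// pages run out. Checking only the first page is the suspected bug.
func FindInstanceByName(l Lister, name string) (Instance, error) {
	start := ""
	for {
		page, err := l.List(start)
		if err != nil {
			return Instance{}, err
		}
		for _, inst := range page.Instances {
			if inst.Name == name {
				return inst, nil
			}
		}
		if page.NextStart == "" {
			return Instance{}, ErrNotFound
		}
		start = page.NextStart
	}
}

// fakeLister is an in-memory Lister for demonstration. The start token
// is simply the decimal index of the next item (an assumption).
type fakeLister struct {
	all      []Instance
	pageSize int
}

func (f fakeLister) List(start string) (Page, error) {
	size := f.pageSize
	if size <= 0 {
		size = 1 // guard against a page size that would never advance
	}
	begin := 0
	if start != "" {
		n, err := strconv.Atoi(start)
		if err != nil || n < 0 || n > len(f.all) {
			return Page{}, errors.New("invalid start token")
		}
		begin = n
	}
	end := begin + size
	if end >= len(f.all) {
		return Page{Instances: f.all[begin:]}, nil
	}
	return Page{Instances: f.all[begin:end], NextStart: strconv.Itoa(end)}, nil
}
```

With a page size of 2 and three instances, a lookup for the third instance succeeds only because the loop follows NextStart to the second page. A lookup that inspected just the first page would report the instance as missing and go on to create a duplicate.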