-
Bug
-
Resolution: Unresolved
Description of problem:
While provisioning many IBI SNOs with the siteconfig operator, we observe a delay that grows across the clusters. For example, all 297 clusterinstances were pushed to git and created by argocd at approximately the same time, yet the first ImageClusterInstall object was created only 1s after its clusterinstance, whereas the last ImageClusterInstall object was created 341s (5 minutes 41 seconds) after its clusterinstance.
1st clusterinstance creationTimestamp - 2024-10-16 3:16:45
297th clusterinstance creationTimestamp - 2024-10-16 3:16:46
1st ImageClusterInstall creationTimestamp - 2024-10-16 3:16:46
297th ImageClusterInstall creationTimestamp - 2024-10-16 3:22:27
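As a quick check of the arithmetic, the two delays can be recomputed from the four timestamps listed above. This is only a sketch: the helper name `delay_seconds` and the format string are illustrative assumptions, not part of any operator tooling.

```python
from datetime import datetime

# Timestamp layout as written in this report (assumed format, hour not zero-padded).
FMT = "%Y-%m-%d %H:%M:%S"


def delay_seconds(created: str, installed: str) -> float:
    """Seconds between a clusterinstance creation and its ImageClusterInstall creation."""
    return (datetime.strptime(installed, FMT) - datetime.strptime(created, FMT)).total_seconds()


# Values copied from the report.
first = delay_seconds("2024-10-16 3:16:45", "2024-10-16 3:16:46")
last = delay_seconds("2024-10-16 3:16:46", "2024-10-16 3:22:27")

minutes, seconds = divmod(int(last), 60)
print(f"first: {first:.0f}s, last: {last:.0f}s ({minutes}m {seconds}s)")
```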
The delay is also visible when comparing each clusterinstance's "ClusterInstanceValidated" condition lastTransitionTime with its creationTimestamp. The delay grows by 1-3s with each successive cluster, with occasional larger jumps of 15s or 30s.
The delay is characteristic of a serialized task:
Stats on ClusterInstances CRs with CreationTimeStamp until InstanceValidated Timestamp (seconds)
Count: 297
Min: 1.0
Average: 205.4
50 percentile: 207.0
95 percentile: 327.2
99 percentile: 337.0
Max: 340.0
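For anyone trying to reproduce these numbers, the following is a minimal sketch of how such stats could be derived. It assumes the ClusterInstance objects are available as parsed JSON with the standard Kubernetes fields `metadata.creationTimestamp` and `status.conditions[].lastTransitionTime`; the function names are hypothetical, and the linear-interpolation percentile used here may not match the method behind the figures above.

```python
from datetime import datetime
from statistics import mean


def parse_ts(ts: str) -> datetime:
    # Kubernetes API timestamps look like "2024-10-16T03:16:45Z".
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")


def validation_delays(items, condition_type="ClusterInstanceValidated"):
    """Seconds from creationTimestamp to the condition's lastTransitionTime, per CR."""
    delays = []
    for item in items:
        created = parse_ts(item["metadata"]["creationTimestamp"])
        for cond in item.get("status", {}).get("conditions", []):
            if cond.get("type") == condition_type:
                transitioned = parse_ts(cond["lastTransitionTime"])
                delays.append((transitioned - created).total_seconds())
                break  # one matching condition per CR is enough
    return delays


def percentile(sorted_vals, p):
    """Percentile by linear interpolation between closest ranks (an assumed method)."""
    if not sorted_vals:
        raise ValueError("no values")
    k = (len(sorted_vals) - 1) * p / 100
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def summarize(delays):
    vals = sorted(delays)
    return {
        "count": len(vals),
        "min": vals[0],
        "average": mean(vals),
        "p50": percentile(vals, 50),
        "p95": percentile(vals, 95),
        "p99": percentile(vals, 99),
        "max": vals[-1],
    }
```

The `items` argument would typically be the `items` list from a JSON dump of the clusterinstance CRs on the hub (for example via `oc get ... -o json`), passed through `json.load` first.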
Attached is a graph showing the delay as well.
Version-Release number of selected component (if applicable):
OCP hub and IBI-deployed clusters: 4.17.1
ACM 2.12 - 2.12.0-DOWNSTREAM-2024-10-10-17-47-25
How reproducible:
Steps to Reproduce:
- ...