-
Task
-
Resolution: Unresolved
-
Normal
-
None
-
None
Overview:
There have been a couple of CI failures for the scanner-v4-install-tests test suite (e.g. ROX-28514), which are caused by temporary unavailability of the Kube API server during `helm install` invocations.
In such a situation `helm install` fails as follows:
INFO: Wed Mar 12 10:18:00 UTC 2025: [deploy-stackrox] Error: rendered manifests contain a resource that already exists. Unable to continue with install: could not get information about the resource ClusterRoleBinding "stackrox:review-tokens-binding" in namespace "": an error on the server ("Internal Server Error: \"/apis/rbac.authorization.k8s.io/v1/clusterrolebindings/stackrox:review-tokens-binding\": the server is currently unable to handle the request") has prevented the request from succeeding (get clusterrolebindings.rbac.authorization.k8s.io stackrox:review-tokens-binding)
Apparently `helm` does not currently have built-in functionality for doing retries automatically. Therefore we might be forced to write our own helm CLI wrapper, similar to the `retry-kubectl.sh` script, which we are using already in CI.