  1. Red Hat Advanced Cluster Security
  2. ROX-28586

Retry helm CLI invocations on temporary Kube API server unavailability

    • Type: Task
    • Resolution: Unresolved
    • Priority: Normal
    • CI
    • Quality / Stability / Reliability

      Overview:

      There have been a couple of CI failures in the scanner-v4-install-tests test suite (e.g. ROX-28514), caused by temporary unavailability of the Kube API server during `helm install` invocations.

      In such a situation `helm install` fails as follows:

      INFO: Wed Mar 12 10:18:00 UTC 2025: [deploy-stackrox] Error: rendered manifests contain a resource that already exists. Unable to continue with install: could not get information about the resource ClusterRoleBinding "stackrox:review-tokens-binding" in namespace "": an error on the server ("Internal Server Error: \"/apis/rbac.authorization.k8s.io/v1/clusterrolebindings/stackrox:review-tokens-binding\": the server is currently unable to handle the request") has prevented the request from succeeding (get clusterrolebindings.rbac.authorization.k8s.io stackrox:review-tokens-binding)

       

      Apparently `helm` does not currently have built-in functionality for retrying automatically. We may therefore need to write our own helm CLI wrapper, similar to the `retry-kubectl.sh` script that we already use in CI.
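
      A minimal sketch of what such a wrapper could look like, for discussion only. It is not derived from the existing `retry-kubectl.sh`. The function names (`retry_helm`, `is_transient_error`), the environment variables (`RETRY_HELM_ATTEMPTS`, `RETRY_HELM_DELAY_SECONDS`, `HELM_BIN`) and the matched error strings are all assumptions; the error strings are taken from the failure quoted above and would probably need extending.

```shell
# Hypothetical helm retry wrapper (sketch, not the existing retry-kubectl.sh).

# Succeeds if the given text looks like a transient Kube API server error.
# The patterns are assumptions based on the CI failure quoted in this ticket.
is_transient_error() {
    case "$1" in
        *"currently unable to handle the request"*|*"Internal Server Error"*)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Usage: retry_helm <helm arguments...>
# Hypothetical tunables:
#   RETRY_HELM_ATTEMPTS       maximum number of attempts (default 5)
#   RETRY_HELM_DELAY_SECONDS  pause between attempts (default 10)
#   HELM_BIN                  command to run (default "helm")
# The body runs in a subshell so the helper variables do not leak.
retry_helm() (
    attempts="${RETRY_HELM_ATTEMPTS:-5}"
    delay="${RETRY_HELM_DELAY_SECONDS:-10}"
    helm_bin="${HELM_BIN:-helm}"
    stderr_file="$(mktemp)"
    attempt=1

    while :; do
        rc=0
        # Capture stderr so it can be inspected; stdout passes through as-is.
        "$helm_bin" "$@" 2>"$stderr_file" || rc=$?
        stderr_text="$(cat "$stderr_file")"
        # Replay the captured stderr (it is shown after the command finishes).
        if [ -n "$stderr_text" ]; then printf '%s\n' "$stderr_text" >&2; fi

        # Stop on success, on a non-transient error, or when out of attempts.
        if [ "$rc" -eq 0 ]; then break; fi
        if ! is_transient_error "$stderr_text"; then break; fi
        if [ "$attempt" -ge "$attempts" ]; then break; fi

        echo "retry_helm: transient error (attempt $attempt/$attempts), retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done

    rm -f "$stderr_file"
    return "$rc"
)
```

      CI scripts would then call `retry_helm install ...` with the same arguments they currently pass to `helm install ...`. One open question for this approach: if an attempt fails after helm has already recorded the release, a plain retry of `helm install` may fail because the release name is in use, so the wrapper might need `helm upgrade --install` or a cleanup step between attempts.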

              Assignee: Unassigned
              Reporter: Moritz Clasmeier (mclasmei@redhat.com)
              ACS Install