Cost Management / COST-5307

[Case 03876979, 03879174]: costmanagement-metrics-operator failing to install on ROSA HCP


      Two customers, on separate support cases, are hitting the same failure when installing the operator on different ROSA HCP clusters. They follow the same steps they would use to install it on a ROSA Classic cluster, but the installation fails.

      I have looked at the logs and found similar errors and warnings:

      Warning:

      time="2024-07-19T11:07:53Z" level=warning msg="needs reinstall: waiting for deployment costmanagement-metrics-operator to become ready: deployment \"costmanagement-metrics-operator\" not available: Deployment does not have minimum availability." csv=costmanagement-metrics-operator.3.3.0 id=XaYc2 namespace=costmanagement-metrics-operator phase=Failed strategy=deployment

      Error:
      machine_controller.go:641] "Drain failed, retry in 20s" err="[ ... error when waiting for pod \"costmanagement-metrics-operator-...\" terminating: global timeout reached: 20s

      Other:
      event.go:298] Event(v1.ObjectReference{Kind:"ClusterServiceVersion", Namespace:"costmanagement-metrics-operator", Name:"costmanagement-metrics-operator.3.3.0", UID:"...", APIVersion:"operators.coreos.com/v1alpha1", ResourceVersion:"40971299", FieldPath:""}): type: 'Warning' reason: 'InstallComponentFailed' install strategy failed: Internal error occurred: failed calling webhook "validate.kyverno.svc-fail": failed to call webhook: Post "https://kyverno-svc.kyverno.svc:443/validate/fail?timeout=10s": no endpoints available for service "kyverno-svc"
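To compare these excerpts across the two cases, the relevant fields can be pulled out mechanically. The following is a minimal sketch, not part of the operator or of any Red Hat tooling: the helper names `parse_logfmt` and `find_blocking_webhook` are hypothetical, and it assumes the log lines have the same shape as the ones quoted above. It only extracts what the logs already say (the CSV phase, and the webhook and service named in the install failure); it does not establish a root cause.

```python
import re
import shlex


def parse_logfmt(line):
    """Split a logfmt-style line (key=value key="quoted value") into a dict."""
    fields = {}
    # shlex handles double-quoted values and backslash-escaped quotes inside them.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


WEBHOOK_RE = re.compile(r'failed calling webhook "([^"]+)"')
SERVICE_RE = re.compile(r'no endpoints available for service "([^"]+)"')


def find_blocking_webhook(message):
    """Return the webhook and service named in an admission failure, or None."""
    webhook = WEBHOOK_RE.search(message)
    if not webhook:
        return None
    service = SERVICE_RE.search(message)
    return {
        "webhook": webhook.group(1),
        "service": service.group(1) if service else None,
    }


# The warning line quoted above, verbatim.
warning = r'time="2024-07-19T11:07:53Z" level=warning msg="needs reinstall: waiting for deployment costmanagement-metrics-operator to become ready: deployment \"costmanagement-metrics-operator\" not available: Deployment does not have minimum availability." csv=costmanagement-metrics-operator.3.3.0 id=XaYc2 namespace=costmanagement-metrics-operator phase=Failed strategy=deployment'

# The tail of the InstallComponentFailed event quoted above.
event = 'install strategy failed: Internal error occurred: failed calling webhook "validate.kyverno.svc-fail": failed to call webhook: Post "https://kyverno-svc.kyverno.svc:443/validate/fail?timeout=10s": no endpoints available for service "kyverno-svc"'

fields = parse_logfmt(warning)
blocker = find_blocking_webhook(event)
print(fields["csv"], fields["phase"])
print(blocker)
```

Run against the excerpts from both cases, this makes it easy to confirm whether the same webhook and service appear in each `InstallComponentFailed` event.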

      I have discussed this with ROSA HCP SRE and they don't see a platform issue. Would it be possible to check whether this is related to an architectural difference that the operator has not accounted for?

            mskarbek Michael Skarbek
            rhn-support-ngareaga Natalia Garea Garcia