Operator Runtime / OPRUN-3268

Impact statement request for OCPBUGS-24009 OLM Operator packageserver Reporting Unavailable on InstallComponentFailed


    • Type: Spike
    • Resolution: Duplicate
    • Priority: Critical

      Impact assessment of OCPBUGS-24009

      Which 4.y.z to 4.y'.z' updates increase vulnerability?

      Any upgrade up to 4.15.{current-z}

      Which types of clusters?

      Any non-MicroShift cluster that had an operator installed via OLM before the upgrade to 4.15. After upgrading to 4.15, re-installing a previously uninstalled operator may also cause this issue.

      What is the impact? Is it serious enough to warrant removing update recommendations?

      Operators installed via OLM cannot be upgraded and may incorrectly report a failed status.

      How involved is remediation?

      Delete the resources that the olm-operator failure message names for the affected OLM installation.

      A failure message similar to this may appear on the CSV:

      InstallComponentFailed install strategy failed: rolebindings.rbac.authorization.k8s.io "openshift-gitops-operator-controller-manager-service-auth-reader" already exists

      The following resource types have been observed to encounter this issue and should be safe to delete:

      • ClusterRoleBinding suffixed with "-system:auth-delegator"
      • Service
      • RoleBinding suffixed with "-auth-reader"

      Under no circumstances should a user delete a CustomResourceDefinition (CRD), even if the same error names one, because data loss may occur. Note that we have not seen this type of resource named in the error from any of our users so far.
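The guidance above can be sketched as a small triage helper. This is a minimal illustration, not part of OLM: the function name `classify`, the verdict strings, and the regular expression are hypothetical, and the pattern assumes the failure message has the same shape as the example shown earlier.

```python
import re

# Matches messages shaped like the example above:
#   <resource>[.<api group>] "<name>" already exists
# Other failure messages may be worded differently (assumption).
ALREADY_EXISTS = re.compile(
    r'(?P<resource>[a-z0-9.]+) "(?P<name>[^"]+)" already exists'
)


def classify(message):
    """Return (resource, name, verdict) for a CSV failure message, or None."""
    match = ALREADY_EXISTS.search(message)
    if match is None:
        return None
    # "rolebindings.rbac.authorization.k8s.io" -> "rolebindings"
    resource = match.group("resource").split(".")[0]
    name = match.group("name")
    if resource == "customresourcedefinitions":
        # CRDs must never be deleted: data loss may occur.
        verdict = "never-delete"
    elif resource == "services":
        verdict = "observed-safe"
    elif resource == "rolebindings" and name.endswith("-auth-reader"):
        verdict = "observed-safe"
    elif resource == "clusterrolebindings" and name.endswith("-system:auth-delegator"):
        verdict = "observed-safe"
    else:
        # Not one of the resource types listed above; a human should decide.
        verdict = "review-manually"
    return resource, name, verdict
```

Anything the helper does not recognize falls through to `review-manually`, so the list of observed-safe types is never silently widened.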

      If a resource appears risky to delete, labeling it with olm.managed: "true" and then restarting the olm-operator pod in the openshift-operator-lifecycle-manager namespace may also resolve the issue.
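As a rough sketch of the label-and-restart alternative, the helper below only builds the `oc` argument lists so they can be reviewed before anything is run. The function names are hypothetical. The pod selector `app=olm-operator` is an assumption about how the olm-operator pod is labeled, so verify it on the cluster first.

```python
OLM_NAMESPACE = "openshift-operator-lifecycle-manager"


def label_command(kind, name, namespace=None):
    """Build the `oc label` call that marks a resource as OLM-managed."""
    cmd = ["oc", "label", kind, name, "olm.managed=true"]
    # Cluster-scoped kinds such as ClusterRoleBinding take no namespace.
    if namespace is not None:
        cmd += ["-n", namespace]
    return cmd


def restart_command():
    """Build the call that deletes the olm-operator pod so it is recreated."""
    # Assumption: the olm-operator pod carries the label app=olm-operator.
    return ["oc", "delete", "pod", "-n", OLM_NAMESPACE, "-l", "app=olm-operator"]
```

For example, `label_command("rolebinding", "<name>", "<namespace>")` followed by `restart_command()` gives the two invocations in order. Passing each list to `subprocess.run` would execute them.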

      Is this a regression?

      Yes, functionality which worked in 4.14 may break after upgrading to 4.15. Not a regression; this is a new issue related to performance improvements added to OLM in 4.15.

      https://issues.redhat.com/browse/OCPBUGS-24009

      https://issues.redhat.com/browse/OCPBUGS-31080

      https://issues.redhat.com/browse/OCPBUGS-28845

            rh-ee-dfranz Daniel Franz
            afri@afri.cz Petr Muller