Bug
Resolution: Done-Errata
Normal
4.9.z
Moderate
No
Rejected
False
This is a clone of issue OCPBUGS-18305. The following is the description of the original issue:
—
Description of problem:
It appears possible for invalid CSV entries to remain in the resolver cache, which prevents an Operator from being reinstalled.

The situation:
--------------
A customer removed the CSV, InstallPlan and Subscription for the GitOps Operator from the cluster. On attempting to reinstall the Operator, OLM reported a conflict with an existing CSV. That CSV had been removed previously and was no longer present in etcd. Deleting the `operator-catalog` and `operator-lifecycle-manager` Pods resolved the conflict, and the Operator could be installed again.

~~~
'Warning' reason: 'ResolutionFailed' constraints not satisfiable: subscription openshift-gitops-operator exists, subscription openshift-gitops-operator requires redhat-operators/openshift-marketplace/stable/openshift-gitops-operator.v1.5.8, redhat-operators/openshift-marketplace/stable/openshift-gitops-operator.v1.5.8 and @existing/openshift-operators//openshift-gitops-operator.v1.5.6-0.1664915551.p originate from package openshift-gitops-operator, clusterserviceversion openshift-gitops-operator.v1.5.6-0.1664915551.p exists and is not referenced by a subscription
~~~
Version-Release number of selected component (if applicable):
4.9.31
How reproducible:
Very intermittent; however, once the issue had occurred, it could not be cleared without deleting the Pods.
Steps to Reproduce:
1. Add Operator with manual approval InstallPlan
2. Remove Operator (Subscription, CSV, InstallPlan)
3. Attempt to reinstall Operator
Actual results:
Very intermittently, the reinstall fails with the `ResolutionFailed` conflict shown in the description.
Expected results:
Operators do not have conflicts with CSVs which have already been removed.
Additional info:
From a brief review of the OLM code, it appears that an internal resolver cache is populated and used for checking constraints when an Operator is installed. Stale entries in this cache would produce the issue described. The cache appears to have been rearchitected (moved to a dedicated object) since OCP 4.9.31. Because the issue is so intermittent, there are no clear reproduction steps. If the issue cannot be reproduced, I would like instructions on how to dump the contents of the cache should the issue arise again.
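To illustrate the suspected failure mode, here is a minimal sketch of how a snapshot-style cache that is not invalidated on deletion could report a conflict with a CSV that no longer exists. This is a toy model, not OLM's code: `Cluster`, `ResolverCache`, `package_of`, and `resolve` are hypothetical names, and deriving the package name by splitting the CSV name at `.v` is an assumption made only for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Stand-in for the API server: CSV names that really exist, per namespace."""
    csvs: dict[str, set[str]] = field(default_factory=dict)


@dataclass
class ResolverCache:
    """Hypothetical per-namespace snapshot cache (not OLM's real type)."""
    cluster: Cluster
    snapshots: dict[str, set[str]] = field(default_factory=dict)

    def existing_csvs(self, namespace: str) -> set[str]:
        # Populate on first use only; later reads reuse the copied snapshot.
        if namespace not in self.snapshots:
            self.snapshots[namespace] = set(self.cluster.csvs.get(namespace, set()))
        return self.snapshots[namespace]

    def invalidate(self, namespace: str) -> None:
        # Drop the snapshot so the next read goes back to the cluster.
        self.snapshots.pop(namespace, None)


def package_of(csv_name: str) -> str:
    # Assumption for this sketch: the package is the part before ".v<version>".
    return csv_name.split(".v", 1)[0]


def resolve(cache: ResolverCache, namespace: str, wanted: str) -> list[str]:
    """Return conflict messages for cached CSVs from the same package as `wanted`."""
    pkg = package_of(wanted)
    return [
        f"{wanted} and @existing/{namespace}//{existing} originate from package {pkg}"
        for existing in sorted(cache.existing_csvs(namespace))
        if existing != wanted and package_of(existing) == pkg
    ]


ns = "openshift-operators"
old_csv = "openshift-gitops-operator.v1.5.6-0.1664915551.p"
new_csv = "openshift-gitops-operator.v1.5.8"

cluster = Cluster({ns: {old_csv}})
cache = ResolverCache(cluster)
cache.existing_csvs(ns)            # cache is populated while the old CSV exists

cluster.csvs[ns].discard(old_csv)  # the CSV is deleted from the cluster

stale = resolve(cache, ns, new_csv)  # still sees the deleted CSV
cache.invalidate(ns)                 # loosely analogous to restarting the Pods
fresh = resolve(cache, ns, new_csv)  # conflict is gone
```

In this model, `stale` contains one conflict that names the deleted CSV, while `fresh` is empty after invalidation. That mirrors the observed behaviour, where the conflict disappeared once the Pods were deleted, but it does not confirm that OLM's cache works this way.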
- blocks: OCPBUGS-18511 Stale CSV Entries in the Resolver Cache for Operator Lifecycle Manager (Verified)
- clones: OCPBUGS-18512 Stale CSV Entries in the Resolver Cache for Operator Lifecycle Manager (Closed)
- is blocked by: OCPBUGS-18512 Stale CSV Entries in the Resolver Cache for Operator Lifecycle Manager (Closed)
- is cloned by: OCPBUGS-18511 Stale CSV Entries in the Resolver Cache for Operator Lifecycle Manager (Verified)
- links to: RHBA-2023:5350 OpenShift Container Platform 4.11.z bug fix update