- Bug
- Resolution: Unresolved
- Major
- None
- 4.14.z, 4.15.z, 4.17.z, 4.16.z, 4.18
- Important
- None
- Eevee OLM Sprint 265, Flareon OLM Sprint 266
- 2
- Rejected
- False
- Addresses an issue whereby concurrent reconciliation of the same namespace led to erroneous terminal states on Subscriptions
- Bug Fix
- In Progress
Description of problem:
When installing ROSA/OSD operators, OLM "locks up" the Subscription object with "ConstraintsNotSatisfiable" 3-15% of the time, depending on the environment.
Version-Release number of selected component (if applicable):
Recently tested on:
- OSD 4.17.5
- 4.18 nightly (from cluster bot)
However, the prevalence across the ROSA fleet suggests this is not a new issue.
How reproducible:
Very. This is highly prevalent across the OSD/ROSA Classic cluster fleet. Any new OSD/ROSA Classic cluster has a good chance of at least one of its ~12 OSD-specific operators being affected at install time.
Steps to Reproduce:
0. Set up a cluster using cluster bot.
1. Label at least one worker node with node-role.kubernetes.io=infra
2. Install must gather operator with "oc apply -f mgo.yaml" (file attached)
3. Wait for the pods to come up.
4. Start this loop:
   for i in `seq -w 999`; do
     echo -ne ">>>>>>> $i\t\t"; date
     oc get -n openshift-must-gather-operator subscription/must-gather-operator -o yaml >mgo-sub-$i.yaml
     oc delete -f mgo.yaml
     oc apply -f mgo.yaml
     sleep 100
   done
5. Let it run for a few hours.
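The loop in step 4 waits a fixed 100 seconds between iterations. As an optional variation, the wait could poll the Subscription instead, so that each snapshot is taken once the install has settled or the time budget is spent. The sketch below is a generic bash helper, not part of the original reproduction; the name poll_until is hypothetical, and treating the Subscription state AtLatestKnown as "installed" is an assumption made for illustration.

```shell
#!/usr/bin/env bash
# poll_until EXPECTED TIMEOUT_SECONDS INTERVAL_SECONDS CMD [ARGS...]
# Hypothetical helper: runs CMD repeatedly until its stdout equals EXPECTED
# (returns 0) or TIMEOUT_SECONDS have elapsed (returns 1).
poll_until() {
  local expected="$1" timeout="$2" interval="$3"
  shift 3
  # SECONDS is a bash builtin counting seconds since the shell started.
  local deadline=$((SECONDS + timeout)) out
  while :; do
    # A failing command is treated the same as empty output.
    out="$("$@" 2>/dev/null)" || out=""
    if [ "$out" = "$expected" ]; then
      return 0
    fi
    if [ "$SECONDS" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$interval"
  done
}

# Example use in place of "sleep 100" (assumes AtLatestKnown marks success):
#   poll_until AtLatestKnown 100 5 \
#     oc get -n openshift-must-gather-operator subscription/must-gather-operator \
#       -o jsonpath='{.status.state}'
```

Polling changes the timing of the delete/apply cycle, so it may alter how often the race is hit; the fixed sleep remains the reference reproduction.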
Actual results:
Run "grep ConstraintsNotSatisfiable *.yaml". You should find that a few of the Subscriptions ended up in a "locked" state from which there is no upgrade without manual intervention:
  - message: 'constraints not satisfiable: @existing/openshift-must-gather-operator//must-gather-operator.v4.17.281-gd5416c9 and must-gather-operator-registry/openshift-must-gather-operator/stable/must-gather-operator.v4.17.281-gd5416c9 originate from package must-gather-operator, subscription must-gather-operator requires must-gather-operator-registry/openshift-must-gather-operator/stable/must-gather-operator.v4.17.281-gd5416c9, subscription must-gather-operator exists, clusterserviceversion must-gather-operator.v4.17.281-gd5416c9 exists and is not referenced by a subscription'
    reason: ConstraintsNotSatisfiable
    status: "True"
    type: ResolutionFailed
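To turn the grep above into a failure rate, the saved snapshots can be counted. The following is a minimal sketch, assuming the snapshots are the mgo-sub-*.yaml files written by the reproduction loop; the function name lock_rate is hypothetical, and the percentage is truncated to an integer.

```shell
#!/usr/bin/env bash
# lock_rate [DIR]
# Hypothetical helper: prints "<locked> <total> <percent>" for the
# mgo-sub-*.yaml snapshots in DIR (default: current directory).
lock_rate() {
  local dir="${1:-.}" total=0 locked=0 f
  for f in "$dir"/mgo-sub-*.yaml; do
    [ -e "$f" ] || continue   # the glob matched nothing
    total=$((total + 1))
    # Same pattern as the manual grep in the ticket.
    if grep -q 'ConstraintsNotSatisfiable' "$f"; then
      locked=$((locked + 1))
    fi
  done
  if [ "$total" -eq 0 ]; then
    echo "0 0 0"
  else
    # Integer arithmetic: the percentage is rounded down.
    echo "$locked $total $((100 * locked / total))"
  fi
}

lock_rate "${1:-.}"
```

Each snapshot is taken at the start of an iteration, so it reflects the outcome of the previous install attempt.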
Expected results:
Each installation attempt should succeed.
Additional info:
mgo.yaml:
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-must-gather-operator
  annotations:
    package-operator.run/collision-protection: IfNoController
    package-operator.run/phase: namespaces
    openshift.io/node-selector: ""
  labels:
    openshift.io/cluster-logging: "true"
    openshift.io/cluster-monitoring: 'true'
---
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: must-gather-operator-registry
  namespace: openshift-must-gather-operator
  annotations:
    package-operator.run/collision-protection: IfNoController
    package-operator.run/phase: must-gather-operator
  labels:
    opsrc-datastore: "true"
    opsrc-provider: redhat
spec:
  image: quay.io/app-sre/must-gather-operator-registry@sha256:0a0610e37a016fb4eed1b000308d840795838c2306f305a151c64cf3b4fd6bb4
  displayName: must-gather-operator
  icon:
    base64data: ''
    mediatype: ''
  publisher: Red Hat
  sourceType: grpc
  grpcPodConfig:
    securityContextConfig: restricted
    nodeSelector:
      node-role.kubernetes.io: infra
    tolerations:
    - effect: NoSchedule
      key: node-role.kubernetes.io/infra
      operator: Exists
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: must-gather-operator
  namespace: openshift-must-gather-operator
  annotations:
    package-operator.run/collision-protection: IfNoController
    package-operator.run/phase: must-gather-operator
spec:
  channel: stable
  name: must-gather-operator
  source: must-gather-operator-registry
  sourceNamespace: openshift-must-gather-operator
---
apiVersion: operators.coreos.com/v1alpha2
kind: OperatorGroup
metadata:
  name: must-gather-operator
  namespace: openshift-must-gather-operator
  annotations:
    package-operator.run/collision-protection: IfNoController
    package-operator.run/phase: must-gather-operator
    olm.operatorframework.io/exclude-global-namespace-resolution: 'true'
spec:
  targetNamespaces:
  - openshift-must-gather-operator
- depends on
- OCPBUGS-48661 When installing an operator OLM locks the Subscription 3-15% of the times [release-4.16]
- Closed
- is depended on by
- OCPBUGS-48663 When installing an operator OLM locks the Subscription 3-15% of the times [release-4.14]
- ASSIGNED
- links to