Resolution: Not a Bug
Description of problem:
- A broken CatalogSource prevents installation of an independent operator that belongs to a different CatalogSource.
- After the broken CatalogSource is deleted, the independent operator can be installed.
- This could have a large impact on a cluster if a CatalogSource is broken and nobody has noticed.
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
1. Create a broken CatalogSource:

$ cat catsrc-test.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: simple-broken-catsource
  namespace: openshift-marketplace
spec:
  image: registry/image:v1
  sourceType: grpc
  updateStrategy:
    registryPoll:
      interval: 1440m
$ oc apply -f catsrc-test.yaml
______________________________________________
2. Create a Subscription:

$ cat sub-test.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: simple-broken-subscription
  namespace: openshift-operators
spec:
  channel: stable
  name: simple-broken-subscription
  source: simple-broken-catsource
  sourceNamespace: openshift-marketplace
$ oc apply -f sub-test.yaml
______________________________________________
The grafana Subscription could not be retrieved from the web console, which reported:
~~~
Error getting YAML: YAMLException: unacceptable kind of an object to dump [object Undefined]
~~~
The grafana Subscription as pulled from the API:
~~~
- apiVersion: operators.coreos.com/v1alpha1
  kind: Subscription
  metadata:
    creationTimestamp: "2023-09-06T20:33:52Z"
    generation: 1
    labels:
      operators.coreos.com/grafana-operator.grafana-tester-2: ""
    name: grafana-operator
    namespace: grafana-tester-2
    resourceVersion: "2107267"
    uid: c0cc6b6f-3b3c-4e59-93a1-dba23264e722
  spec:
    channel: v4
    installPlanApproval: Automatic
    name: grafana-operator
    source: community-operators
    sourceNamespace: openshift-marketplace
    startingCSV: grafana-operator.v4.10.1
  ...
    conditions:
    - lastTransitionTime: "2023-09-06T20:33:52Z"
      message: all available catalogsources are healthy
      reason: AllCatalogSourcesHealthy
      status: "False"
      type: CatalogSourcesUnhealthy
    - message: 'failed to populate resolver cache from source simple-broken-catsource/openshift-marketplace: failed to list bundles: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp xxx.xx.xxx.xx:50051: connect: connection refused"'
      reason: ErrorPreventedResolution
      status: "True"
      type: ResolutionFailed
    lastUpdated: "2023-09-06T20:33:52Z"
~~~
- It looks like when a CatalogSource pod fails specifically because its source image is missing, OLM fails to update Subscriptions of any sort. Once the faulty CatalogSource was removed, reinstalling the Subscription succeeded.
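
For triage, the ResolutionFailed condition above already names the catalog source that blocked resolution, even though the affected Subscription points at community-operators. The following is a minimal sketch, not an OLM tool: `failing_sources` is a hypothetical helper, and the regular expression is an assumption based only on the single error message quoted in this report (abbreviated here), so other OLM versions may word it differently.

```python
import re

# Pattern assumed from the one error message quoted in this report;
# it is not a documented OLM message format.
_SOURCE_RE = re.compile(r"from source ([^/\s]+)/([^:\s]+):")


def failing_sources(conditions):
    """Return (name, namespace) pairs named in active ResolutionFailed conditions."""
    found = []
    for cond in conditions:
        # Only consider ResolutionFailed conditions that are currently true.
        if cond.get("type") != "ResolutionFailed" or cond.get("status") != "True":
            continue
        match = _SOURCE_RE.search(cond.get("message", ""))
        if match:
            found.append((match.group(1), match.group(2)))
    return found


# Conditions copied from the grafana Subscription in this report
# (the second message is abbreviated).
conditions = [
    {
        "type": "CatalogSourcesUnhealthy",
        "status": "False",
        "reason": "AllCatalogSourcesHealthy",
        "message": "all available catalogsources are healthy",
    },
    {
        "type": "ResolutionFailed",
        "status": "True",
        "reason": "ErrorPreventedResolution",
        "message": (
            "failed to populate resolver cache from source "
            "simple-broken-catsource/openshift-marketplace: "
            "failed to list bundles: rpc error: code = Unavailable ..."
        ),
    },
]

print(failing_sources(conditions))
# prints [('simple-broken-catsource', 'openshift-marketplace')]
```

The result points at simple-broken-catsource in openshift-marketplace, which matches the observation that removing that CatalogSource unblocked the installation.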
Actual results:
A broken CatalogSource prevents installation of an independent operator from another CatalogSource.
Expected results:
A broken CatalogSource should not affect an independent operator from another CatalogSource.
Additional info:
- is cloned by OCPBUGS-18817 [Doc] A broken catalogsource is impacting an independent operator of another catalogsource (Closed)