-
Bug
-
Resolution: Done
-
Major
-
None
-
4.16
-
None
-
Quality / Stability / Reliability
-
False
Description of problem:
The customer cannot install the IBM DataPower Cloud Agent Operator, or any other operator, in the openshift-operators namespace.
The Subscription for this operator shows the following error:
$ omc get subs -n openshift-operators ibm-dpod-cloud-agent-operator -oyaml
...
conditions:
- lastTransitionTime: "2025-03-26T21:25:45Z"
  message: all available catalogsources are healthy
  reason: AllCatalogSourcesHealthy
  status: "False"
  type: CatalogSourcesUnhealthy
- message: 'bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline'
  reason: BundleUnpackFailed
  status: "True"
  type: BundleUnpackFailed
lastUpdated: "2025-03-26T21:25:45Z"
We followed the workaround described in https://access.redhat.com/solutions/6459071 (removing the Job and ConfigMap associated with this operator), but the issue persists.
Also, since the cluster is running v4.16, it should not be affected by the bug referenced in that solution (https://issues.redhat.com/browse/OCPBUGS-6771).
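For reference, the cleanup step of that workaround can be sketched as below. This is a minimal sketch, not the exact procedure from the solution article: the function name print_unpack_cleanup is hypothetical, and it assumes the unpack Job and its ConfigMap share one name and live in openshift-marketplace. It only prints the commands so they can be reviewed before anything is deleted.

```shell
# Hypothetical helper: print (do not run) the cleanup commands for one
# bundle-unpack Job and the ConfigMap of the same name.
# Assumption: both objects live in openshift-marketplace unless a second
# argument names another namespace.
print_unpack_cleanup() {
  name="$1"
  ns="${2:-openshift-marketplace}"
  printf 'oc delete job %s -n %s\n' "$name" "$ns"
  printf 'oc delete configmap %s -n %s\n' "$name" "$ns"
}
```

Calling print_unpack_cleanup with the Job name (found with oc get jobs -n openshift-marketplace) prints the two delete commands for review.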
The issue occurs whether the operator is installed via the web console or the CLI. I also found this event in the openshift-marketplace namespace, related to the DPOD operator catalog:
$ omc get ev -n openshift-marketplace | grep -i failed
38m Warning FailedToUpdateEndpoint endpoints/ibm-dpod-cloud-agent-catalog Failed to update endpoint openshift-marketplace/ibm-dpod-cloud-agent-catalog: Operation cannot be fulfilled on endpoints "ibm-dpod-cloud-agent-catalog": StorageError: invalid object, Code: 4, Key: /kubernetes.io/services/endpoints/openshift-marketplace/ibm-dpod-cloud-agent-catalog, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: a3875276-537b-4bde-af9a-59a23f7ede54, UID in object meta:
Notably, ALL the operators installed in the openshift-operators namespace report the same "bundle unpacking failed. Reason: DeadlineExceeded" error. For example:
$ omc get subs -n openshift-operators
NAME                                                                           PACKAGE                       SOURCE                         CHANNEL
datapower-operator-v1.6-ibm-operator-catalog-openshift-marketplace             datapower-operator            ibm-operator-catalog           v1.6
devspaces                                                                      devspaces                     redhat-operators               stable
ibm-apiconnect                                                                 ibm-apiconnect                ibm-operator-catalog           v3.8
ibm-common-service-operator-v3.23-ibm-operator-catalog-openshift-marketplace   ibm-common-service-operator   ibm-operator-catalog           v3.23
ibm-dpod-cloud-agent-operator                                                  dpod-cloud-agent-operator     ibm-dpod-cloud-agent-catalog   stable-v1.2
open-liberty-certified                                                         open-liberty-certified        certified-operators            v1.4
postgresql                                                                     postgresql                    community-operators            v5
- Devspaces:
$ omc get subs -n openshift-operators devspaces -oyaml | grep conditions: -A20
conditions:
- lastTransitionTime: "2025-03-26T21:11:18Z"
  message: all available catalogsources are healthy
  reason: AllCatalogSourcesHealthy
  status: "False"
  type: CatalogSourcesUnhealthy
- message: 'bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job
    was active longer than specified deadline'
  reason: BundleUnpackFailed
  status: "True"
  type: BundleUnpackFailed
- reason: UnpackingInProgress
  status: "True"
  type: BundleUnpacking
lastUpdated: "2025-03-26T21:11:18Z"
- ibm-apiconnect:
$ omc get subs -n openshift-operators devspaces -oyaml | grep conditions: -A20
conditions:
- lastTransitionTime: "2025-03-26T21:11:18Z"
  message: all available catalogsources are healthy
  reason: AllCatalogSourcesHealthy
  status: "False"
  type: CatalogSourcesUnhealthy
- message: 'bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job
    was active longer than specified deadline'
  reason: BundleUnpackFailed
  status: "True"
  type: BundleUnpackFailed
- reason: UnpackingInProgress
  status: "True"
  type: BundleUnpacking
lastUpdated: "2025-03-26T21:11:18Z"
- PostgreSQL
$ omc get subs -n openshift-operators postgresql -oyaml
conditions:
- lastTransitionTime: "2025-03-26T21:10:59Z"
  message: all available catalogsources are healthy
  reason: AllCatalogSourcesHealthy
  status: "False"
  type: CatalogSourcesUnhealthy
- reason: UnpackingInProgress
  status: "True"
  type: BundleUnpacking
- message: 'bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline'
  reason: BundleUnpackFailed
  status: "True"
  type: BundleUnpackFailed
- message: 'error using catalogsource openshift-marketplace/ibm-operator-catalog: failed to list bundles: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 10.65.46.166:50051: connect: connection timed out"'
  reason: ErrorPreventedResolution
  status: "True"
  type: ResolutionFailed
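Rather than inspecting each Subscription by hand, the affected ones can be listed in one pass. The following is a minimal sketch: the function name failed_unpack_subs is hypothetical, and it expects one line per Subscription in the form "<name> <Type>=<Status> ...", which the oc jsonpath query in the comment is assumed to produce on a live cluster.

```shell
# Hypothetical filter: read lines of "<subscription> <Type>=<Status> ..."
# on stdin and print the names whose BundleUnpackFailed condition is True.
failed_unpack_subs() {
  awk '{
    for (i = 2; i <= NF; i++) {
      if ($i == "BundleUnpackFailed=True") { print $1; break }
    }
  }'
}

# Assumed usage against a live cluster (not run here):
#   oc get subs -n openshift-operators \
#     -o jsonpath='{range .items[*]}{.metadata.name}{range .status.conditions[*]}{" "}{.type}{"="}{.status}{end}{"\n"}{end}' \
#     | failed_unpack_subs
```

Keeping the filter separate from the cluster query means it can also be fed saved output when no live cluster is reachable.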
Additionally, all the installed operators also report this message:
message: 'error using catalogsource openshift-marketplace/ibm-operator-catalog:
failed to list bundles: rpc error: code = Unavailable desc = connection error:
desc = "transport: Error while dialing: dial tcp 10.65.46.166:50051: connect:
connection timed out"'
I suspected an issue with the ibm-operator-catalog CatalogSource that could be affecting the rest of the operators. I asked the customer to test connectivity to this catalog, but they report that there is no connectivity issue:
# oc debug node/aroinfarodeveast001-rnktw-worker-eastus3-nwbd2
sh-5.1# curl -v telnet://10.65.46.166:50051
curl -v telnet://10.65.46.166:50051
* Trying 10.65.46.166:50051...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* Connected to 10.65.46.166 (10.65.46.166) port 50051 (#0)
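The same reachability check can be scripted so it is easy to repeat from other nodes or pods. This is a minimal sketch under stated assumptions: the function name check_tcp is hypothetical, and it relies on bash's /dev/tcp redirection and the coreutils timeout command being available where it runs. It only confirms that a TCP connection opens and does not exercise the gRPC API behind the port.

```shell
# Hypothetical helper: exit 0 if a TCP connection to host:port opens
# within 5 seconds, non-zero otherwise.
# Assumes bash (for /dev/tcp) and coreutils `timeout` are installed.
check_tcp() {
  # Inside the inner bash, $0 is the host and $1 is the port.
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Assumed usage with the address from the error message:
#   check_tcp 10.65.46.166 50051 && echo reachable || echo unreachable
```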
Version-Release number of selected component (if applicable):
4.16.15
How reproducible:
Install any operator from the OperatorHub or via CLI and the BundleUnpackFailed error will be present