Description of problem:
Operator installation/upgrade fails stating: "Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline"
Version-Release number of selected component (if applicable):
4.10
How reproducible:
oc -n openshift-marketplace get job 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e -o yaml
apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: "2022-08-04T12:54:19Z"
  generation: 1
  labels:
    controller-uid: e236f157-ab03-4153-b095-b6b1a97ef3c8
    job-name: 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
  name: 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
  namespace: openshift-marketplace
  ownerReferences:
  - apiVersion: v1
    blockOwnerDeletion: false
    controller: false
    kind: ConfigMap
    name: 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
    uid: 2d6d332d-e680-4828-b97f-e6024b34575b
  resourceVersion: "1299311475"
  uid: e236f157-ab03-4153-b095-b6b1a97ef3c8
spec:
  activeDeadlineSeconds: 600
  backoffLimit: 3
  completionMode: NonIndexed
  completions: 1
  parallelism: 1
  selector:
    matchLabels:
      controller-uid: e236f157-ab03-4153-b095-b6b1a97ef3c8
  suspend: false
  template:
    metadata:
      creationTimestamp: null
      labels:
        controller-uid: e236f157-ab03-4153-b095-b6b1a97ef3c8
        job-name: 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
      name: 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
    spec:
      containers:
      - command:
        - opm
        - alpha
        - bundle
        - extract
        - -m
        - /bundle/
        - -n
        - openshift-marketplace
        - -c
        - 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
        - -z
        env:
        - name: CONTAINER_IMAGE
          value: registry.redhat.io/openshift-logging/cluster-logging-operator-bundle@sha256:d19c4b7b67a70b46b6b3ac43b2f285cc19c52f2795c8dfbea4315bd06e7485ca
        image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8de7a35f7ca26e678b8e3d8bf5fa6aa80b84287413247dc031a785d0d139698c
        imagePullPolicy: IfNotPresent
        name: extract
        resources:
          requests:
            cpu: 10m
            memory: 50Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /bundle
          name: bundle
      dnsPolicy: ClusterFirst
      initContainers:
      - command:
        - /bin/cp
        - -Rv
        - /bin/cpb
        - /util/cpb
        image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cc477d763835d8c874b050223261dde5bcd73429f0cb55aa7f7cde3df892ce0f
        imagePullPolicy: IfNotPresent
        name: util
        resources:
          requests:
            cpu: 10m
            memory: 50Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /util
          name: util
      - command:
        - /util/cpb
        - /bundle
        image: registry.redhat.io/openshift-logging/cluster-logging-operator-bundle@sha256:d19c4b7b67a70b46b6b3ac43b2f285cc19c52f2795c8dfbea4315bd06e7485ca
        imagePullPolicy: Always
        name: pull
        resources:
          requests:
            cpu: 10m
            memory: 50Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /bundle
          name: bundle
        - mountPath: /util
          name: util
      restartPolicy: Never
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - emptyDir: {}
        name: bundle
      - emptyDir: {}
        name: util
status:
  conditions:
  - lastProbeTime: "2022-08-04T13:04:19Z"
    lastTransitionTime: "2022-08-04T13:04:19Z"
    message: Job was active longer than specified deadline
    reason: DeadlineExceeded
    status: "True"
    type: Failed
  failed: 1
  startTime: "2022-08-04T12:54:19Z"

oc -n openshift-logging get installplan install-qzrfp -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: InstallPlan
metadata:
  creationTimestamp: "2022-08-04T12:54:19Z"
  generateName: install-
  generation: 1
  labels:
    operators.coreos.com/cluster-logging.openshift-logging: ""
  name: install-qzrfp
  namespace: openshift-logging
  ownerReferences:
  - apiVersion: operators.coreos.com/v1alpha1
    blockOwnerDeletion: false
    controller: false
    kind: Subscription
    name: cluster-logging-subscription
    uid: 48580ca3-bd57-449e-84ec-84efc8c8035d
  resourceVersion: "1299311512"
  uid: cd93ba60-b8db-448f-9239-1c8b15059eef
spec:
  approval: Automatic
  approved: true
  clusterServiceVersionNames:
  - cluster-logging.5.4.4
  generation: 26
status:
  bundleLookups:
  - catalogSourceRef:
      name: redhat-operators
      namespace: openshift-marketplace
    conditions:
    - message: bundle contents have not yet been persisted to installplan status
      reason: BundleNotUnpacked
      status: "True"
      type: BundleLookupNotPersisted
    - lastTransitionTime: "2022-08-04T12:54:19Z"
      message: 'unpack job not completed: Unpack pod(openshift-marketplace/14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4d5l7rv) container(pull) is pending. Reason: ImagePullBackOff, Message: Back-off pulling image "registry.redhat.io/openshift-logging/cluster-logging-operator-bundle@sha256:d19c4b7b67a70b46b6b3ac43b2f285cc19c52f2795c8dfbea4315bd06e7485ca"'
      reason: JobIncomplete
      status: "True"
      type: BundleLookupPending
    - lastTransitionTime: "2022-08-04T13:04:20Z"
      message: Job was active longer than specified deadline
      reason: DeadlineExceeded
      status: "True"
      type: BundleLookupFailed
    identifier: cluster-logging.5.4.4
    path: registry.redhat.io/openshift-logging/cluster-logging-operator-bundle@sha256:d19c4b7b67a70b46b6b3ac43b2f285cc19c52f2795c8dfbea4315bd06e7485ca
    properties: '{"properties":[{"type":"olm.package","value":{"packageName":"cluster-logging","version":"5.4.4"}},{"type":"olm.maxOpenShiftVersion","value":"4.11"},{"type":"olm.gvk","value":{"group":"logging.openshift.io","kind":"ClusterLogForwarder","version":"v1"}},{"type":"olm.gvk","value":{"group":"logging.openshift.io","kind":"ClusterLogging","version":"v1"}}]}'
    replaces: cluster-logging.5.4.3
  catalogSources: []
  conditions:
  - lastTransitionTime: "2022-08-04T13:04:20Z"
    lastUpdateTime: "2022-08-04T13:04:20Z"
    message: 'Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline'
    reason: InstallCheckFailed
    status: "False"
    type: Installed
  phase: Failed

The solution from https://access.redhat.com/solutions/6459071 works and helps to eventually complete the Operator upgrade. But it is rather painful if this kind of activity needs to be done on 10+ OpenShift Container Platform 4 clusters, and it is therefore requested to further investigate the root cause and make the overall process more robust.
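As a quick cross-check of the timestamps in the Job output above, the following sketch computes how long the unpack Job was active and compares that with spec.activeDeadlineSeconds. The helper name parse_k8s_time is hypothetical; the three values are copied from the report.

```python
from datetime import datetime, timezone


def parse_k8s_time(ts: str) -> datetime:
    """Parse a UTC timestamp in the form printed by `oc get -o yaml`."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


# Values copied from the Job in this report.
start_time = parse_k8s_time("2022-08-04T12:54:19Z")   # status.startTime
failed_at = parse_k8s_time("2022-08-04T13:04:19Z")    # Failed condition lastTransitionTime
active_deadline_seconds = 600                         # spec.activeDeadlineSeconds

elapsed = int((failed_at - start_time).total_seconds())
hit_deadline = elapsed >= active_deadline_seconds

print(f"job active for {elapsed}s, deadline {active_deadline_seconds}s, exceeded={hit_deadline}")
```

The Job was marked Failed exactly when the deadline elapsed, which suggests the pull container never got past ImagePullBackOff within that window rather than the extract step itself being slow.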
Steps to Reproduce:
Seen often when upgrading Operators
Actual results:
The Operator upgrade fails, and the steps from https://access.redhat.com/solutions/6459071 need to be applied to resume and eventually complete the upgrade.
Expected results:
The Operator upgrade should complete as expected without hitting this problem, even when there are certain resource or networking constraints. The timeout should be large enough to cope with many different situations and conditions; if it is still exceeded, the failure should report what is causing the problem.
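To illustrate the kind of reporting asked for here, the sketch below pairs the BundleLookupFailed condition with the last BundleLookupPending message from the same InstallPlan, so that the underlying cause (ImagePullBackOff in this report) is shown next to DeadlineExceeded. This is only an illustration of the idea, not how OLM reports failures; the function name summarize_bundle_failure is hypothetical, and the example dict is an abbreviated copy of the InstallPlan status from this report.

```python
from typing import Optional


def summarize_bundle_failure(install_plan: dict) -> Optional[str]:
    """Return a one-line summary for the first failed bundle lookup, if any."""
    for lookup in install_plan.get("status", {}).get("bundleLookups", []):
        # Index the conditions of this lookup by their type.
        conds = {c.get("type"): c for c in lookup.get("conditions", [])}
        failed = conds.get("BundleLookupFailed")
        if not failed or failed.get("status") != "True":
            continue
        # The pending condition usually carries the more specific message.
        pending = conds.get("BundleLookupPending", {})
        cause = pending.get("message") or "no pending-phase message recorded"
        return f"{lookup.get('identifier')}: {failed.get('reason')} (last known cause: {cause})"
    return None


# Abbreviated from the InstallPlan in this report; the pending message is shortened.
example = {
    "status": {
        "bundleLookups": [{
            "identifier": "cluster-logging.5.4.4",
            "conditions": [
                {"type": "BundleLookupPending", "status": "True", "reason": "JobIncomplete",
                 "message": "container(pull) is pending. Reason: ImagePullBackOff"},
                {"type": "BundleLookupFailed", "status": "True", "reason": "DeadlineExceeded",
                 "message": "Job was active longer than specified deadline"},
            ],
        }]
    }
}

summary = summarize_bundle_failure(example)
print(summary)
```

In practice the dict could come from parsing the output of `oc get installplan <name> -o json` with json.loads.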
Additional info:
https://access.redhat.com/solutions/6459071
More than 100 cases have used the above article to resolve this issue, so a large number of people are affected.
- clones: OCPBUGS-6771 Operator installation/upgrade fails with "Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline - 4.15.z (Closed)
- depends on: OCPBUGS-6771 Operator installation/upgrade fails with "Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline - 4.15.z (Closed)
- duplicates: OCPBUGS-35021 Operator installation/upgrade fails with "Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline - 4.12.z (Closed)
- links to: RHBA-2024:0941 OpenShift Container Platform 4.14.z bug fix update