Description of problem:
Operator installation/upgrade fails with the message: "Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline"
Version-Release number of selected component (if applicable):
4.10
How reproducible:
oc -n openshift-marketplace get job 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e -o yaml
apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: "2022-08-04T12:54:19Z"
  generation: 1
  labels:
    controller-uid: e236f157-ab03-4153-b095-b6b1a97ef3c8
    job-name: 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
  name: 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
  namespace: openshift-marketplace
  ownerReferences:
  - apiVersion: v1
    blockOwnerDeletion: false
    controller: false
    kind: ConfigMap
    name: 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
    uid: 2d6d332d-e680-4828-b97f-e6024b34575b
  resourceVersion: "1299311475"
  uid: e236f157-ab03-4153-b095-b6b1a97ef3c8
spec:
  activeDeadlineSeconds: 600
  backoffLimit: 3
  completionMode: NonIndexed
  completions: 1
  parallelism: 1
  selector:
    matchLabels:
      controller-uid: e236f157-ab03-4153-b095-b6b1a97ef3c8
  suspend: false
  template:
    metadata:
      creationTimestamp: null
      labels:
        controller-uid: e236f157-ab03-4153-b095-b6b1a97ef3c8
        job-name: 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
      name: 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
    spec:
      containers:
      - command:
        - opm
        - alpha
        - bundle
        - extract
        - -m
        - /bundle/
        - -n
        - openshift-marketplace
        - -c
        - 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
        - -z
        env:
        - name: CONTAINER_IMAGE
          value: registry.redhat.io/openshift-logging/cluster-logging-operator-bundle@sha256:d19c4b7b67a70b46b6b3ac43b2f285cc19c52f2795c8dfbea4315bd06e7485ca
        image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8de7a35f7ca26e678b8e3d8bf5fa6aa80b84287413247dc031a785d0d139698c
        imagePullPolicy: IfNotPresent
        name: extract
        resources:
          requests:
            cpu: 10m
            memory: 50Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /bundle
          name: bundle
      dnsPolicy: ClusterFirst
      initContainers:
      - command:
        - /bin/cp
        - -Rv
        - /bin/cpb
        - /util/cpb
        image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cc477d763835d8c874b050223261dde5bcd73429f0cb55aa7f7cde3df892ce0f
        imagePullPolicy: IfNotPresent
        name: util
        resources:
          requests:
            cpu: 10m
            memory: 50Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /util
          name: util
      - command:
        - /util/cpb
        - /bundle
        image: registry.redhat.io/openshift-logging/cluster-logging-operator-bundle@sha256:d19c4b7b67a70b46b6b3ac43b2f285cc19c52f2795c8dfbea4315bd06e7485ca
        imagePullPolicy: Always
        name: pull
        resources:
          requests:
            cpu: 10m
            memory: 50Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /bundle
          name: bundle
        - mountPath: /util
          name: util
      restartPolicy: Never
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - emptyDir: {}
        name: bundle
      - emptyDir: {}
        name: util
status:
  conditions:
  - lastProbeTime: "2022-08-04T13:04:19Z"
    lastTransitionTime: "2022-08-04T13:04:19Z"
    message: Job was active longer than specified deadline
    reason: DeadlineExceeded
    status: "True"
    type: Failed
  failed: 1
  startTime: "2022-08-04T12:54:19Z"
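The timestamps in the Job output above are consistent with the Job being terminated exactly when `activeDeadlineSeconds` elapsed. As a rough illustration, the arithmetic can be sketched as follows. The helper names `parse_k8s_time` and `job_deadline` are hypothetical and exist only for this example; the values are copied from the Job shown above.

```python
from datetime import datetime, timedelta, timezone

# Timestamp layout used in the `oc get ... -o yaml` output above (UTC, "Z" suffix).
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_k8s_time(value: str) -> datetime:
    """Parse a timestamp such as "2022-08-04T12:54:19Z" into an aware UTC datetime."""
    return datetime.strptime(value, TIME_FORMAT).replace(tzinfo=timezone.utc)


def job_deadline(start_time: str, active_deadline_seconds: int) -> datetime:
    """Return the moment at which the Job's active deadline expires."""
    return parse_k8s_time(start_time) + timedelta(seconds=active_deadline_seconds)


# Values taken from the Job in this report.
start_time = "2022-08-04T12:54:19Z"        # status.startTime
active_deadline_seconds = 600              # spec.activeDeadlineSeconds
failed_at = "2022-08-04T13:04:19Z"         # lastTransitionTime of the Failed condition

deadline = job_deadline(start_time, active_deadline_seconds)
elapsed = parse_k8s_time(failed_at) - parse_k8s_time(start_time)

print("deadline:", deadline.strftime(TIME_FORMAT))
print("elapsed seconds:", int(elapsed.total_seconds()))
```

The computed deadline matches the `lastTransitionTime` of the `Failed` condition, which suggests the unpack pod never made progress during the whole 10-minute window rather than failing partway through.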
oc -n openshift-logging get installplan install-qzrfp -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: InstallPlan
metadata:
  creationTimestamp: "2022-08-04T12:54:19Z"
  generateName: install-
  generation: 1
  labels:
    operators.coreos.com/cluster-logging.openshift-logging: ""
  name: install-qzrfp
  namespace: openshift-logging
  ownerReferences:
  - apiVersion: operators.coreos.com/v1alpha1
    blockOwnerDeletion: false
    controller: false
    kind: Subscription
    name: cluster-logging-subscription
    uid: 48580ca3-bd57-449e-84ec-84efc8c8035d
  resourceVersion: "1299311512"
  uid: cd93ba60-b8db-448f-9239-1c8b15059eef
spec:
  approval: Automatic
  approved: true
  clusterServiceVersionNames:
  - cluster-logging.5.4.4
  generation: 26
status:
  bundleLookups:
  - catalogSourceRef:
      name: redhat-operators
      namespace: openshift-marketplace
    conditions:
    - message: bundle contents have not yet been persisted to installplan status
      reason: BundleNotUnpacked
      status: "True"
      type: BundleLookupNotPersisted
    - lastTransitionTime: "2022-08-04T12:54:19Z"
      message: 'unpack job not completed: Unpack pod(openshift-marketplace/14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4d5l7rv)
        container(pull) is pending. Reason: ImagePullBackOff, Message: Back-off pulling
        image "registry.redhat.io/openshift-logging/cluster-logging-operator-bundle@sha256:d19c4b7b67a70b46b6b3ac43b2f285cc19c52f2795c8dfbea4315bd06e7485ca"'
      reason: JobIncomplete
      status: "True"
      type: BundleLookupPending
    - lastTransitionTime: "2022-08-04T13:04:20Z"
      message: Job was active longer than specified deadline
      reason: DeadlineExceeded
      status: "True"
      type: BundleLookupFailed
    identifier: cluster-logging.5.4.4
    path: registry.redhat.io/openshift-logging/cluster-logging-operator-bundle@sha256:d19c4b7b67a70b46b6b3ac43b2f285cc19c52f2795c8dfbea4315bd06e7485ca
    properties: '{"properties":[{"type":"olm.package","value":{"packageName":"cluster-logging","version":"5.4.4"}},{"type":"olm.maxOpenShiftVersion","value":"4.11"},{"type":"olm.gvk","value":{"group":"logging.openshift.io","kind":"ClusterLogForwarder","version":"v1"}},{"type":"olm.gvk","value":{"group":"logging.openshift.io","kind":"ClusterLogging","version":"v1"}}]}'
    replaces: cluster-logging.5.4.3
  catalogSources: []
  conditions:
  - lastTransitionTime: "2022-08-04T13:04:20Z"
    lastUpdateTime: "2022-08-04T13:04:20Z"
    message: 'Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job
      was active longer than specified deadline'
    reason: InstallCheckFailed
    status: "False"
    type: Installed
  phase: Failed
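The top-level InstallPlan condition only reports DeadlineExceeded, while the more useful hint (ImagePullBackOff on the `pull` container) sits in the `BundleLookupPending` condition under `status.bundleLookups`. The following is a minimal sketch of how those nested conditions could be listed from the JSON form of the InstallPlan (for example, the output of `oc get installplan ... -o json` loaded with `json.loads`). The function name `active_bundle_conditions` is hypothetical, and the sample dictionary is a trimmed copy of the fields shown above, not a complete InstallPlan.

```python
def active_bundle_conditions(install_plan: dict) -> list:
    """Return (identifier, type, reason, message) for every bundle lookup
    condition whose status is "True"."""
    findings = []
    for lookup in install_plan.get("status", {}).get("bundleLookups", []):
        identifier = lookup.get("identifier", "<unknown>")
        for cond in lookup.get("conditions", []):
            if cond.get("status") == "True":
                findings.append((
                    identifier,
                    cond.get("type", ""),
                    cond.get("reason", ""),
                    cond.get("message", ""),
                ))
    return findings


# Trimmed sample built from the InstallPlan in this report (illustration only).
sample_install_plan = {
    "status": {
        "bundleLookups": [{
            "identifier": "cluster-logging.5.4.4",
            "conditions": [
                {"type": "BundleLookupNotPersisted", "reason": "BundleNotUnpacked",
                 "status": "True",
                 "message": "bundle contents have not yet been persisted to installplan status"},
                {"type": "BundleLookupPending", "reason": "JobIncomplete",
                 "status": "True",
                 "message": "unpack job not completed: Unpack pod(openshift-marketplace/"
                            "14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4d5l7rv) "
                            "container(pull) is pending. Reason: ImagePullBackOff, "
                            "Message: Back-off pulling image "
                            "\"registry.redhat.io/openshift-logging/cluster-logging-operator-bundle"
                            "@sha256:d19c4b7b67a70b46b6b3ac43b2f285cc19c52f2795c8dfbea4315bd06e7485ca\""},
                {"type": "BundleLookupFailed", "reason": "DeadlineExceeded",
                 "status": "True",
                 "message": "Job was active longer than specified deadline"},
            ],
        }],
        "phase": "Failed",
    },
}

findings = active_bundle_conditions(sample_install_plan)
for identifier, cond_type, reason, message in findings:
    print(f"{identifier} [{cond_type}/{reason}]: {message}")
```

In this report the listing makes it visible that the deadline was hit while the bundle image pull was backing off, which is the kind of underlying cause the final failure message does not surface.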
The solution from https://access.redhat.com/solutions/6459071 works and eventually lets the Operator upgrade complete. However, applying this workaround by hand on 10+ OpenShift Container Platform 4 clusters is tedious, so we request further investigation of the root cause and a more robust overall process.
Steps to Reproduce:
Seen often when upgrading Operators
Actual results:
The Operator upgrade fails, and the steps from https://access.redhat.com/solutions/6459071 need to be applied to resume and eventually complete the upgrade.
Expected results:
The Operator upgrade should complete as expected without problems, even under certain resource or networking constraints. The timeout should be large enough to cope with many different situations and conditions; if it is still exceeded, the failure should report what is causing the problem.
Additional info:
https://access.redhat.com/solutions/6459071 Around 100 or more cases have used the above article to resolve this issue, and a large number of people are affected.
- clones: OCPBUGS-6771 Operator installation/upgrade fails with "Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline - 4.15.z (Closed)
- depends on: OCPBUGS-6771 Operator installation/upgrade fails with "Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline - 4.15.z (Closed)
- duplicates: OCPBUGS-35021 Operator installation/upgrade fails with "Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline - 4.12.z (Closed)
- links to: RHBA-2024:0941 OpenShift Container Platform 4.14.z bug fix update