-
Bug
-
Resolution: Not a Bug
-
Major
-
None
-
4.10
-
Moderate
-
None
-
OPECO 233
-
1
-
Rejected
-
x86_64
-
Description of problem:
According to https://bugzilla.redhat.com/show_bug.cgi?id=1921264 and https://bugzilla.redhat.com/show_bug.cgi?id=2014308, the problem with `Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline` during Operator upgrade should be resolved and no longer occur.
However, we are still seeing the error reported on OpenShift Container Platform 4.10.15 and 4.10.24, most recently during Cluster Logging 5.4.4 updates.
oc -n openshift-marketplace get job 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e -o yaml
apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: "2022-08-04T12:54:19Z"
  generation: 1
  labels:
    controller-uid: e236f157-ab03-4153-b095-b6b1a97ef3c8
    job-name: 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
  name: 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
  namespace: openshift-marketplace
  ownerReferences:
  - apiVersion: v1
    blockOwnerDeletion: false
    controller: false
    kind: ConfigMap
    name: 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
    uid: 2d6d332d-e680-4828-b97f-e6024b34575b
  resourceVersion: "1299311475"
  uid: e236f157-ab03-4153-b095-b6b1a97ef3c8
spec:
  activeDeadlineSeconds: 600
  backoffLimit: 3
  completionMode: NonIndexed
  completions: 1
  parallelism: 1
  selector:
    matchLabels:
      controller-uid: e236f157-ab03-4153-b095-b6b1a97ef3c8
  suspend: false
  template:
    metadata:
      creationTimestamp: null
      labels:
        controller-uid: e236f157-ab03-4153-b095-b6b1a97ef3c8
        job-name: 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
      name: 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
    spec:
      containers:
      - command:
        - opm
        - alpha
        - bundle
        - extract
        - -m
        - /bundle/
        - -n
        - openshift-marketplace
        - -c
        - 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
        - -z
        env:
        - name: CONTAINER_IMAGE
          value: registry.redhat.io/openshift-logging/cluster-logging-operator-bundle@sha256:d19c4b7b67a70b46b6b3ac43b2f285cc19c52f2795c8dfbea4315bd06e7485ca
        image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8de7a35f7ca26e678b8e3d8bf5fa6aa80b84287413247dc031a785d0d139698c
        imagePullPolicy: IfNotPresent
        name: extract
        resources:
          requests:
            cpu: 10m
            memory: 50Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /bundle
          name: bundle
      dnsPolicy: ClusterFirst
      initContainers:
      - command:
        - /bin/cp
        - -Rv
        - /bin/cpb
        - /util/cpb
        image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cc477d763835d8c874b050223261dde5bcd73429f0cb55aa7f7cde3df892ce0f
        imagePullPolicy: IfNotPresent
        name: util
        resources:
          requests:
            cpu: 10m
            memory: 50Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /util
          name: util
      - command:
        - /util/cpb
        - /bundle
        image: registry.redhat.io/openshift-logging/cluster-logging-operator-bundle@sha256:d19c4b7b67a70b46b6b3ac43b2f285cc19c52f2795c8dfbea4315bd06e7485ca
        imagePullPolicy: Always
        name: pull
        resources:
          requests:
            cpu: 10m
            memory: 50Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /bundle
          name: bundle
        - mountPath: /util
          name: util
      restartPolicy: Never
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - emptyDir: {}
        name: bundle
      - emptyDir: {}
        name: util
status:
  conditions:
  - lastProbeTime: "2022-08-04T13:04:19Z"
    lastTransitionTime: "2022-08-04T13:04:19Z"
    message: Job was active longer than specified deadline
    reason: DeadlineExceeded
    status: "True"
    type: Failed
  failed: 1
  startTime: "2022-08-04T12:54:19Z"
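The timestamps in the Job status above can be checked against `activeDeadlineSeconds`. The following is a minimal Python sketch, not part of OLM or `oc`; the helper names `parse_ts` and `deadline_report` are hypothetical, and the `job` dictionary is a trimmed copy of the fields shown above (in practice it could be loaded from `oc ... get job ... -o json`).

```python
from datetime import datetime


def parse_ts(ts: str) -> datetime:
    """Parse a Kubernetes UTC timestamp such as "2022-08-04T12:54:19Z"."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")


def deadline_report(job: dict) -> dict:
    """Summarise whether a Job failed and how long it ran before failing.

    Hypothetical helper: it only reads fields that appear in the Job above.
    """
    deadline = job["spec"].get("activeDeadlineSeconds")
    start = parse_ts(job["status"]["startTime"])
    failed = [
        c for c in job["status"].get("conditions", [])
        if c.get("type") == "Failed" and c.get("status") == "True"
    ]
    if not failed:
        return {"failed": False, "reason": None,
                "elapsed_seconds": None, "deadline_seconds": deadline}
    cond = failed[0]
    elapsed = (parse_ts(cond["lastTransitionTime"]) - start).total_seconds()
    return {"failed": True, "reason": cond.get("reason"),
            "elapsed_seconds": elapsed, "deadline_seconds": deadline}


# Trimmed copy of the Job status reported in this bug.
job = {
    "spec": {"activeDeadlineSeconds": 600},
    "status": {
        "startTime": "2022-08-04T12:54:19Z",
        "conditions": [{
            "type": "Failed",
            "status": "True",
            "reason": "DeadlineExceeded",
            "lastTransitionTime": "2022-08-04T13:04:19Z",
        }],
    },
}

print(deadline_report(job))
```

For the Job in this report, the elapsed time between `startTime` and the `Failed` condition matches the 600-second deadline, which suggests the Job was terminated by the deadline rather than by exhausting `backoffLimit`.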
oc -n openshift-logging get installplan install-qzrfp -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: InstallPlan
metadata:
  creationTimestamp: "2022-08-04T12:54:19Z"
  generateName: install-
  generation: 1
  labels:
    operators.coreos.com/cluster-logging.openshift-logging: ""
  name: install-qzrfp
  namespace: openshift-logging
  ownerReferences:
  - apiVersion: operators.coreos.com/v1alpha1
    blockOwnerDeletion: false
    controller: false
    kind: Subscription
    name: cluster-logging-subscription
    uid: 48580ca3-bd57-449e-84ec-84efc8c8035d
  resourceVersion: "1299311512"
  uid: cd93ba60-b8db-448f-9239-1c8b15059eef
spec:
  approval: Automatic
  approved: true
  clusterServiceVersionNames:
  - cluster-logging.5.4.4
  generation: 26
status:
  bundleLookups:
  - catalogSourceRef:
      name: redhat-operators
      namespace: openshift-marketplace
    conditions:
    - message: bundle contents have not yet been persisted to installplan status
      reason: BundleNotUnpacked
      status: "True"
      type: BundleLookupNotPersisted
    - lastTransitionTime: "2022-08-04T12:54:19Z"
      message: 'unpack job not completed: Unpack pod(openshift-marketplace/14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4d5l7rv)
        container(pull) is pending. Reason: ImagePullBackOff, Message: Back-off pulling
        image "registry.redhat.io/openshift-logging/cluster-logging-operator-bundle@sha256:d19c4b7b67a70b46b6b3ac43b2f285cc19c52f2795c8dfbea4315bd06e7485ca"'
      reason: JobIncomplete
      status: "True"
      type: BundleLookupPending
    - lastTransitionTime: "2022-08-04T13:04:20Z"
      message: Job was active longer than specified deadline
      reason: DeadlineExceeded
      status: "True"
      type: BundleLookupFailed
    identifier: cluster-logging.5.4.4
    path: registry.redhat.io/openshift-logging/cluster-logging-operator-bundle@sha256:d19c4b7b67a70b46b6b3ac43b2f285cc19c52f2795c8dfbea4315bd06e7485ca
    properties: '{"properties":[{"type":"olm.package","value":{"packageName":"cluster-logging","version":"5.4.4"}}, {"type":"olm.maxOpenShiftVersion","value":"4.11"},{"type":"olm.gvk","value":{"group":"logging.openshift.io","kind":"ClusterLogForwarder","version":"v1"}},{"type":"olm.gvk","value":{"group":"logging.openshift.io","kind":"ClusterLogging","version":"v1"}}]}'
    replaces: cluster-logging.5.4.3
  catalogSources: []
  conditions:
  - lastTransitionTime: "2022-08-04T13:04:20Z"
    lastUpdateTime: "2022-08-04T13:04:20Z"
    message: 'Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job
      was active longer than specified deadline'
    reason: InstallCheckFailed
    status: "False"
    type: Installed
  phase: Failed
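The top-level InstallPlan condition only reports `DeadlineExceeded`, while the more specific cause (`ImagePullBackOff` on the `pull` container) is recorded in the `BundleLookupPending` condition under `status.bundleLookups`. The sketch below, in Python, shows one way to surface that nested message; `pending_causes` is a hypothetical helper name, and the `install_plan` dictionary is a trimmed copy of the status shown above (it could equally be loaded from `oc ... get installplan ... -o json`).

```python
def pending_causes(install_plan: dict) -> list:
    """Collect (identifier, reason, message) for every active
    BundleLookupPending condition in an InstallPlan.

    Hypothetical helper: it only reads fields present in the output above.
    """
    causes = []
    lookups = install_plan.get("status", {}).get("bundleLookups", [])
    for lookup in lookups:
        for cond in lookup.get("conditions", []):
            if (cond.get("type") == "BundleLookupPending"
                    and cond.get("status") == "True"):
                causes.append((lookup.get("identifier"),
                               cond.get("reason"),
                               cond.get("message")))
    return causes


# Trimmed copy of the InstallPlan status reported in this bug.
install_plan = {
    "status": {
        "bundleLookups": [{
            "identifier": "cluster-logging.5.4.4",
            "conditions": [
                {"type": "BundleLookupNotPersisted", "status": "True",
                 "reason": "BundleNotUnpacked"},
                {"type": "BundleLookupPending", "status": "True",
                 "reason": "JobIncomplete",
                 "message": (
                     "unpack job not completed: Unpack pod(openshift-marketplace/"
                     "14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4d5l7rv) "
                     "container(pull) is pending. Reason: ImagePullBackOff, "
                     "Message: Back-off pulling "
                     'image "registry.redhat.io/openshift-logging/'
                     "cluster-logging-operator-bundle@sha256:"
                     "d19c4b7b67a70b46b6b3ac43b2f285cc19c52f2795c8dfbea4315bd06e7485ca"
                     '"'
                 )},
                {"type": "BundleLookupFailed", "status": "True",
                 "reason": "DeadlineExceeded"},
            ],
        }],
    },
}

for identifier, reason, message in pending_causes(install_plan):
    print(f"{identifier}: {reason}: {message}")
```

Running a check like this across affected clusters may help distinguish image pull problems from other reasons for the unpack Job exceeding its deadline.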
The solution from https://access.redhat.com/solutions/6459071 works and helps to eventually complete the Operator upgrade. However, it is cumbersome when this kind of activity needs to be done on 10+ OpenShift Container Platform 4 clusters. It is therefore requested to further investigate the root cause and make the overall process more robust.
Version-Release number of selected component (if applicable):
- OpenShift Container Platform 4.10.15 and 4.10.24
How reproducible:
- Random/unclear
Steps to Reproduce:
1. Was seen rather often when Cluster Logging 5.4.4 was made available
Actual results:
Operator upgrade fails, and the steps from https://access.redhat.com/solutions/6459071 need to be applied to resume and eventually complete the upgrade
Expected results:
Operator upgrade should complete as expected without hitting this problem, even when there are certain resource or networking constraints. The timeout should be large enough to cope with a range of situations and conditions; otherwise, the failure should report what is causing the problem.
Additional info:
- is related to
-
OCPBUGS-6771 Operator installation/upgrade fails with "Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline - 4.15.z
- Closed