[OCPBUGS-29194] Operator installation/upgrade fails with "Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline - 4.14.z - Red Hat Issue Tracker

Type: Bug
Resolution: Done-Errata
Priority: Major
Fix Version/s: 4.14.z
Affects Version/s: 4.13.z, 4.12.z, 4.11.z, 4.10.z, 4.14.z
Component/s: OLM
Labels:
- OLM
- cee.neXT
- opeco-triaged
- triaged

Severity:
Moderate
Regression:
None
Release Blocker:
Rejected
Blocked:
False
Blocked Reason:

None
Target Version:

4.14.z

Description of problem:

Operator installation/upgrade fails stating: "Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline"

Version-Release number of selected component (if applicable):

4.10

How reproducible:

oc -n openshift-marketplace get job 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e -o yaml
apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: "2022-08-04T12:54:19Z"
  generation: 1
  labels:
    controller-uid: e236f157-ab03-4153-b095-b6b1a97ef3c8
    job-name: 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
  name: 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
  namespace: openshift-marketplace
  ownerReferences:
  - apiVersion: v1
    blockOwnerDeletion: false
    controller: false
    kind: ConfigMap
    name: 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
    uid: 2d6d332d-e680-4828-b97f-e6024b34575b
  resourceVersion: "1299311475"
  uid: e236f157-ab03-4153-b095-b6b1a97ef3c8
spec:
  activeDeadlineSeconds: 600
  backoffLimit: 3
  completionMode: NonIndexed
  completions: 1
  parallelism: 1
  selector:
    matchLabels:
      controller-uid: e236f157-ab03-4153-b095-b6b1a97ef3c8
  suspend: false
  template:
    metadata:
      creationTimestamp: null
      labels:
        controller-uid: e236f157-ab03-4153-b095-b6b1a97ef3c8
        job-name: 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
      name: 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
    spec:
      containers:
      - command:
        - opm
        - alpha
        - bundle
        - extract
        - -m
        - /bundle/
        - -n
        - openshift-marketplace
        - -c
        - 14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4dec25e
        - -z
        env:
        - name: CONTAINER_IMAGE
          value: registry.redhat.io/openshift-logging/cluster-logging-operator-bundle@sha256:d19c4b7b67a70b46b6b3ac43b2f285cc19c52f2795c8dfbea4315bd06e7485ca
        image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8de7a35f7ca26e678b8e3d8bf5fa6aa80b84287413247dc031a785d0d139698c
        imagePullPolicy: IfNotPresent
        name: extract
        resources:
          requests:
            cpu: 10m
            memory: 50Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /bundle
          name: bundle
      dnsPolicy: ClusterFirst
      initContainers:
      - command:
        - /bin/cp
        - -Rv
        - /bin/cpb
        - /util/cpb
        image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cc477d763835d8c874b050223261dde5bcd73429f0cb55aa7f7cde3df892ce0f
        imagePullPolicy: IfNotPresent
        name: util
        resources:
          requests:
            cpu: 10m
            memory: 50Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /util
          name: util
      - command:
        - /util/cpb
        - /bundle
        image: registry.redhat.io/openshift-logging/cluster-logging-operator-bundle@sha256:d19c4b7b67a70b46b6b3ac43b2f285cc19c52f2795c8dfbea4315bd06e7485ca
        imagePullPolicy: Always
        name: pull
        resources:
          requests:
            cpu: 10m
            memory: 50Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /bundle
          name: bundle
        - mountPath: /util
          name: util
      restartPolicy: Never
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - emptyDir: {}
        name: bundle
      - emptyDir: {}
        name: util
status:
  conditions:
  - lastProbeTime: "2022-08-04T13:04:19Z"
    lastTransitionTime: "2022-08-04T13:04:19Z"
    message: Job was active longer than specified deadline
    reason: DeadlineExceeded
    status: "True"
    type: Failed
  failed: 1
  startTime: "2022-08-04T12:54:19Z"
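
The timestamps in the Job above can be checked against its deadline. The following is a minimal sketch, using only values copied from the Job (`startTime`, the `Failed` condition's `lastTransitionTime`, and `activeDeadlineSeconds`); the helper names `parse_k8s_time` and `seconds_active` are made up for illustration.

```python
from datetime import datetime, timezone


def parse_k8s_time(ts: str) -> datetime:
    """Parse a UTC timestamp in the form printed by `oc get -o yaml`."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def seconds_active(start_time: str, failed_time: str) -> int:
    """Return how many whole seconds passed between the two timestamps."""
    delta = parse_k8s_time(failed_time) - parse_k8s_time(start_time)
    return int(delta.total_seconds())


# Values copied from the Job shown above.
ACTIVE_DEADLINE_SECONDS = 600
elapsed = seconds_active("2022-08-04T12:54:19Z", "2022-08-04T13:04:19Z")
deadline_hit = elapsed >= ACTIVE_DEADLINE_SECONDS

print(elapsed, deadline_hit)  # prints: 600 True
```

The Job was marked failed exactly 600 seconds after it started, which matches `activeDeadlineSeconds: 600`: the Job was stopped by its deadline, not by a failure that the unpack container itself reported.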


oc -n openshift-logging get installplan install-qzrfp -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: InstallPlan
metadata:
  creationTimestamp: "2022-08-04T12:54:19Z"
  generateName: install-
  generation: 1
  labels:
    operators.coreos.com/cluster-logging.openshift-logging: ""
  name: install-qzrfp
  namespace: openshift-logging
  ownerReferences:
  - apiVersion: operators.coreos.com/v1alpha1
    blockOwnerDeletion: false
    controller: false
    kind: Subscription
    name: cluster-logging-subscription
    uid: 48580ca3-bd57-449e-84ec-84efc8c8035d
  resourceVersion: "1299311512"
  uid: cd93ba60-b8db-448f-9239-1c8b15059eef
spec:
  approval: Automatic
  approved: true
  clusterServiceVersionNames:
  - cluster-logging.5.4.4
  generation: 26
status:
  bundleLookups:
  - catalogSourceRef:
      name: redhat-operators
      namespace: openshift-marketplace
    conditions:
    - message: bundle contents have not yet been persisted to installplan status
      reason: BundleNotUnpacked
      status: "True"
      type: BundleLookupNotPersisted
    - lastTransitionTime: "2022-08-04T12:54:19Z"
      message: 'unpack job not completed: Unpack pod(openshift-marketplace/14359dfdd866df54d278e75b42202a5af9ce0cefdf416216dd11e09e4d5l7rv)
        container(pull) is pending. Reason: ImagePullBackOff, Message: Back-off pulling
        image "registry.redhat.io/openshift-logging/cluster-logging-operator-bundle@sha256:d19c4b7b67a70b46b6b3ac43b2f285cc19c52f2795c8dfbea4315bd06e7485ca"'
      reason: JobIncomplete
      status: "True"
      type: BundleLookupPending
    - lastTransitionTime: "2022-08-04T13:04:20Z"
      message: Job was active longer than specified deadline
      reason: DeadlineExceeded
      status: "True"
      type: BundleLookupFailed
    identifier: cluster-logging.5.4.4
    path: registry.redhat.io/openshift-logging/cluster-logging-operator-bundle@sha256:d19c4b7b67a70b46b6b3ac43b2f285cc19c52f2795c8dfbea4315bd06e7485ca
    properties: '{"properties":[{"type":"olm.package","value":{"packageName":"cluster-logging","version":"5.4.4"}},{"type":"olm.maxOpenShiftVersion","value":"4.11"},{"type":"olm.gvk","value":{"group":"logging.openshift.io","kind":"ClusterLogForwarder","version":"v1"}},{"type":"olm.gvk","value":{"group":"logging.openshift.io","kind":"ClusterLogging","version":"v1"}}]}'
    replaces: cluster-logging.5.4.3
  catalogSources: []
  conditions:
  - lastTransitionTime: "2022-08-04T13:04:20Z"
    lastUpdateTime: "2022-08-04T13:04:20Z"
    message: 'Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job
      was active longer than specified deadline'
    reason: InstallCheckFailed
    status: "False"
    type: Installed
  phase: Failed
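
The InstallPlan's top-level condition only reports `DeadlineExceeded`, while the more specific cause (here `ImagePullBackOff` on the `pull` container) sits in the `BundleLookupPending` condition of the same bundle lookup. The sketch below shows one way that cause could be surfaced from an InstallPlan that has already been loaded into a Python dict. The function name `unpack_failure_causes` is hypothetical, and the sample data is an abbreviated copy of the status above, not full cluster output.

```python
from typing import List


def unpack_failure_causes(install_plan: dict) -> List[str]:
    """For each failed bundle lookup, collect the last pending message."""
    causes = []
    lookups = install_plan.get("status", {}).get("bundleLookups", [])
    for lookup in lookups:
        conditions = lookup.get("conditions", [])
        failed = any(
            c.get("type") == "BundleLookupFailed" and c.get("status") == "True"
            for c in conditions
        )
        if not failed:
            continue
        for c in conditions:
            if c.get("type") == "BundleLookupPending" and c.get("message"):
                causes.append(f"{lookup.get('identifier')}: {c['message']}")
    return causes


# Abbreviated from the InstallPlan status shown above (message shortened).
sample = {
    "status": {
        "bundleLookups": [
            {
                "identifier": "cluster-logging.5.4.4",
                "conditions": [
                    {"type": "BundleLookupPending", "status": "True",
                     "reason": "JobIncomplete",
                     "message": "container(pull) is pending. Reason: ImagePullBackOff"},
                    {"type": "BundleLookupFailed", "status": "True",
                     "reason": "DeadlineExceeded",
                     "message": "Job was active longer than specified deadline"},
                ],
            }
        ]
    }
}

causes = unpack_failure_causes(sample)
print(causes)
```

In this report the pending message points at an image pull problem, so the deadline was most likely reached while the pod was still waiting for the bundle image.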

The solution from https://access.redhat.com/solutions/6459071 works and helps to eventually complete the Operator upgrade. However, applying it by hand becomes tedious when it has to be repeated across more than 10 OpenShift Container Platform 4 clusters. We therefore request further investigation of the root cause and a more robust overall process.

Steps to Reproduce:

Seen often when upgrading Operators

Actual results:

The Operator upgrade fails, and the steps from https://access.redhat.com/solutions/6459071 need to be applied to resume and eventually complete the upgrade.

Expected results:

The Operator upgrade should complete as expected without hitting this problem, even under certain resource or networking constraints. The timeout should be large enough to cope with many different situations and conditions; otherwise, the failure should report what is causing the problem.
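
The expectation above (tolerate transient problems, and report the cause when the unpack still fails) can be illustrated with a small retry loop. This is only a sketch under stated assumptions, not OLM's implementation: `unpack_with_retry`, `make_flaky_unpack`, `max_attempts`, and `min_retry_interval` are hypothetical names, and the unpack step is simulated by a function that returns a success flag and a reason string.

```python
import time
from typing import Callable, List, Tuple


def unpack_with_retry(
    run_unpack: Callable[[], Tuple[bool, str]],
    max_attempts: int = 3,
    min_retry_interval: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, List[str]]:
    """Retry a failing unpack step and keep the reason for every failure."""
    reasons: List[str] = []
    for attempt in range(1, max_attempts + 1):
        ok, reason = run_unpack()
        if ok:
            return True, reasons
        reasons.append(f"attempt {attempt}: {reason}")
        if attempt < max_attempts:
            sleep(min_retry_interval)  # wait before the next attempt
    return False, reasons


def make_flaky_unpack(failures_before_success: int) -> Callable[[], Tuple[bool, str]]:
    """Simulated unpack step that fails a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def run() -> Tuple[bool, str]:
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            return False, "ImagePullBackOff"
        return True, ""

    return run


# Two simulated image pull failures, then success on the third attempt.
ok, reasons = unpack_with_retry(make_flaky_unpack(2), max_attempts=3, sleep=lambda s: None)
print(ok, reasons)
```

Keeping the per-attempt reasons means that a final failure can name the underlying problem instead of only reporting that a deadline was exceeded.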

Additional info:

https://access.redhat.com/solutions/6459071

More than 100 cases have used the above article to resolve this issue, so a large number of people are affected.

clones

OCPBUGS-6771 Operator installation/upgrade fails with "Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline - 4.15.z

Closed

depends on

OCPBUGS-6771 Operator installation/upgrade fails with "Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline - 4.15.z

Closed

duplicates

OCPBUGS-35021 Operator installation/upgrade fails with "Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline - 4.12.z

Closed

links to

openshift/operator-framework-olm#689: [release-4.14] OCPBUGS-29194: Retry failing unpack jobs

RHBA-2024:0941 OpenShift Container Platform 4.14.z bug fix update

Assignee: Ankita Thomas

Reporter: David Hernandez Fernandez

QA Contact: Xia Zhao

Need Info From: Daniel Messer

Votes: 0

Watchers: 13

Created: 2024/02/07 5:02 PM

Updated: 2024/08/07 2:24 PM

Resolved: 2024/02/28 12:21 AM
