-
Bug
-
Resolution: Duplicate
-
Normal
-
None
-
4.11.z
-
None
-
Grumpy 241
-
1
-
Rejected
-
False
-
Description of problem:
While upgrading both the platform and the operators of 3423 SNOs, 9 clusters failed to upgrade any of their operators because the installplan for each upgrade reports "bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline".
Version-Release number of selected component (if applicable):
SNO OCP 4.10.32 (clusters with issue), attempting to be upgraded to 4.11.5
Hub OCP 4.11.19
ACM Version - 2.7.0-DOWNSTREAM-2023-01-12-20-55-01
Operators being upgraded from the v4.10 to the v4.11 operators catalog
How reproducible:
9 out of 3423 cluster upgrades (9 of the 84 total failures)
Steps to Reproduce:
1. 2. 3.
Actual results:
Expected results:
Additional info:
Example cluster sno00801:
# oc --kubeconfig=/root/hv-vm/sno/manifests/sno00801/kubeconfig get installplan -A
NAMESPACE                          NAME            CSV                                          APPROVAL   APPROVED
openshift-local-storage            install-7pj97   local-storage-operator.4.10.0-202212061900   Manual     true
openshift-local-storage            install-xdwsj   local-storage-operator.4.11.0-202212070335   Manual     true
openshift-logging                  install-nbxlj   cluster-logging.5.5.5                        Manual     true
openshift-ptp                      install-2mlg4   ptp-operator.4.11.0-202301031954             Manual     true
openshift-ptp                      install-m77t6   ptp-operator.4.10.0-202212072254             Manual     true
openshift-sriov-network-operator   install-n2rvh   sriov-network-operator.4.10.0-202212061900   Manual     true
openshift-sriov-network-operator   install-rffhx   sriov-network-operator.4.11.0-202212071535   Manual     true

# oc --kubeconfig=/root/hv-vm/sno/manifests/sno00801/kubeconfig get csv -A
NAMESPACE                              NAME                                         DISPLAY                     VERSION               REPLACES   PHASE
openshift-local-storage                local-storage-operator.4.10.0-202212061900   Local Storage               4.10.0-202212061900              Succeeded
openshift-logging                      cluster-logging.5.5.5                        Red Hat OpenShift Logging   5.5.5                            Succeeded
openshift-operator-lifecycle-manager   packageserver                                Package Server              0.19.0                           Succeeded
openshift-ptp                          ptp-operator.4.10.0-202212072254             PTP Operator                4.10.0-202212072254              Succeeded
openshift-sriov-network-operator       sriov-network-operator.4.10.0-202212061900   SR-IOV Network Operator     4.10.0-202212061900              Succeeded
Note that all installplans are approved, yet none of the operators' CSVs are at the expected version.
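The comparison above (approved installplans versus installed CSVs) can be automated when checking many clusters. The following is a minimal sketch, assuming the output of `oc get installplan -A -o json` and `oc get csv -A -o json` has already been parsed into dictionaries. The function name `stalled_upgrades` is hypothetical, and the sample data is a hand-written subset of the openshift-ptp rows from this report, not real JSON captured from the cluster.

```python
def stalled_upgrades(installplans, csvs):
    """Return (namespace, csv_name) pairs that an approved InstallPlan names
    but for which no CSV in the Succeeded phase exists."""
    succeeded = {
        (c["metadata"]["namespace"], c["metadata"]["name"])
        for c in csvs.get("items", [])
        if c.get("status", {}).get("phase") == "Succeeded"
    }
    stalled = []
    for plan in installplans.get("items", []):
        spec = plan.get("spec", {})
        if not spec.get("approved"):
            continue  # unapproved plans are not expected to produce a CSV
        namespace = plan["metadata"]["namespace"]
        for csv_name in spec.get("clusterServiceVersionNames") or []:
            if (namespace, csv_name) not in succeeded:
                stalled.append((namespace, csv_name))
    return stalled


# Illustrative sample, reduced to the fields the function reads.
SAMPLE_PLANS = {"items": [
    {"metadata": {"namespace": "openshift-ptp", "name": "install-2mlg4"},
     "spec": {"approved": True,
              "clusterServiceVersionNames": ["ptp-operator.4.11.0-202301031954"]}},
    {"metadata": {"namespace": "openshift-ptp", "name": "install-m77t6"},
     "spec": {"approved": True,
              "clusterServiceVersionNames": ["ptp-operator.4.10.0-202212072254"]}},
]}
SAMPLE_CSVS = {"items": [
    {"metadata": {"namespace": "openshift-ptp",
                  "name": "ptp-operator.4.10.0-202212072254"},
     "status": {"phase": "Succeeded"}},
]}

for namespace, csv_name in stalled_upgrades(SAMPLE_PLANS, SAMPLE_CSVS):
    print(f"{namespace}: {csv_name} approved but not installed")
```

This is only a rough check: after a successful upgrade the older CSV is replaced, so the installplan that originally installed it would also be listed.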
Then looking at the openshift-marketplace namespace:
# oc --kubeconfig=/root/hv-vm/sno/manifests/sno00801/kubeconfig get po,job -n openshift-marketplace
NAME                                                                  READY   STATUS      RESTARTS   AGE
pod/5884db547fb0aebac3a93dece7eb6effaf706e25a01d46ce23f25ff2cffvg8r   0/1     Completed   0          3d17h
pod/7ff2113420a370bd4ca107c3800ace5ad581a2335a355d77de3b9b6f5bqw6d5   0/1     Completed   0          3d17h
pod/9d47955b1a539d0aa8d707cf941bfeef574528d0cb67b9270e0eb0aafcl25gj   0/1     Completed   0          3d17h
pod/bffb5564f8c0b54d67e6a72a609648bfb70e05d03a6a7f9fc970a57451vdv8t   0/1     Completed   0          3d17h
pod/e77265ee6f6e18fe1204e4bbec687b3929f866b3eb14dde98599ce3f74frhr5   0/1     Completed   0          3d17h
pod/marketplace-operator-6fd78976f6-xfkzk                             1/1     Running     2          2d5h
pod/rh-du-operators-kjq9t                                             1/1     Running     0          2d4h

NAME                                                                        COMPLETIONS   DURATION   AGE
job.batch/37e8a8637099e9504d1b0862d0efa22ad127781f6cd58ca0b950e996b853552   0/1           2d4h       2d4h
job.batch/5884db547fb0aebac3a93dece7eb6effaf706e25a01d46ce23f25ff2cf5dcd2   1/1           11s        3d17h
job.batch/7ff2113420a370bd4ca107c3800ace5ad581a2335a355d77de3b9b6f5b62769   1/1           44s        3d17h
job.batch/9d47955b1a539d0aa8d707cf941bfeef574528d0cb67b9270e0eb0aafc30b21   1/1           9s         3d17h
job.batch/bffb5564f8c0b54d67e6a72a609648bfb70e05d03a6a7f9fc970a574518c652   1/1           30s        3d17h
job.batch/cf5c1fe37d69824891c1587cbe87cf59ff99e134dd96107ce8ad26b8af2c4b2   0/1           2d4h       2d4h
job.batch/e4ec62041f18f79682d92d06da3b06b8a4aee8a1e30ecdf502839389363354b   0/1           2d4h       2d4h
job.batch/e77265ee6f6e18fe1204e4bbec687b3929f866b3eb14dde98599ce3f74e7f8c   1/1           40s        3d17h
There are 3 incomplete jobs, one for each of the 3 operators we expect to be upgraded.
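Spotting the 0/1 jobs by eye does not scale to thousands of clusters. As a hedged sketch, the same check could be run against `oc get job -n openshift-marketplace -o json` (the JSON output form is an assumption here; the report only shows the table form). The helper name `incomplete_jobs` is hypothetical, and the `Failed` condition in the sample is inferred from the installplan message rather than copied from the cluster.

```python
def incomplete_jobs(jobs):
    """Return (job_name, failure_reason) for every Job whose succeeded count
    is below its requested completions. failure_reason is None when the Job
    carries no Failed condition."""
    result = []
    for job in jobs.get("items", []):
        wanted = job.get("spec", {}).get("completions") or 1
        status = job.get("status", {})
        if (status.get("succeeded") or 0) >= wanted:
            continue  # job finished normally
        reason = next(
            (c.get("reason") for c in status.get("conditions") or []
             if c.get("type") == "Failed" and c.get("status") == "True"),
            None,
        )
        result.append((job["metadata"]["name"], reason))
    return result


# Illustrative sample: one stuck job and one completed job from the listing.
SAMPLE_JOBS = {"items": [
    {"metadata": {"name": "37e8a8637099e9504d1b0862d0efa22ad127781f6cd58ca0b950e996b853552"},
     "spec": {"completions": 1},
     "status": {"conditions": [
         {"type": "Failed", "status": "True", "reason": "DeadlineExceeded",
          "message": "Job was active longer than specified deadline"}]}},
    {"metadata": {"name": "5884db547fb0aebac3a93dece7eb6effaf706e25a01d46ce23f25ff2cf5dcd2"},
     "spec": {"completions": 1},
     "status": {"succeeded": 1}},
]}

for name, reason in incomplete_jobs(SAMPLE_JOBS):
    print(f"{name}: {reason}")
```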
And if we inspect the installplans we see:
# oc --kubeconfig=/root/hv-vm/sno/manifests/sno00801/kubeconfig get installplan -A -o json | jq '.items[] | "\(.metadata.namespace) \(.metadata.name) \(.status.conditions[] | .message)"'
"openshift-local-storage install-7pj97 null"
"openshift-local-storage install-xdwsj bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline"
"openshift-logging install-nbxlj null"
"openshift-ptp install-2mlg4 bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline"
"openshift-ptp install-m77t6 null"
"openshift-sriov-network-operator install-n2rvh null"
"openshift-sriov-network-operator install-rffhx bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline"
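When the same inspection has to be repeated per cluster, the jq filter above can be mirrored in a small script that keeps only the failing installplans. This is a sketch under assumptions: the input is the parsed output of `oc get installplan -A -o json`, the function name `failed_unpacks` is hypothetical, and the sample below is hand-built from two of the rows shown above.

```python
def failed_unpacks(installplans):
    """Return (namespace, name, message) for every InstallPlan condition whose
    message reports a bundle unpacking failure."""
    failures = []
    for item in installplans.get("items", []):
        meta = item.get("metadata", {})
        for cond in item.get("status", {}).get("conditions") or []:
            message = cond.get("message") or ""  # healthy plans have no message
            if "bundle unpacking failed" in message.lower():
                failures.append(
                    (meta.get("namespace", ""), meta.get("name", ""), message)
                )
    return failures


# Illustrative sample: one healthy and one failing InstallPlan.
SAMPLE_INSTALLPLANS = {"items": [
    {"metadata": {"namespace": "openshift-ptp", "name": "install-m77t6"},
     "status": {"conditions": [{"message": None}]}},
    {"metadata": {"namespace": "openshift-ptp", "name": "install-2mlg4"},
     "status": {"conditions": [{"message":
         "bundle unpacking failed. Reason: DeadlineExceeded, and Message: "
         "Job was active longer than specified deadline"}]}},
]}

for namespace, name, message in failed_unpacks(SAMPLE_INSTALLPLANS):
    print(f"{namespace}/{name}: {message}")
```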
- duplicates
-
OCPBUGS-6771 Operator installation/upgrade fails with "Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline - 4.15.z
- Closed