Bug
Resolution: Done-Errata
Normal
CNV v4.15.0
Storage Core Sprint 249, Storage Core Sprint 250, Storage Core Sprint 251
Description of problem:
After the tier1 CDI OCS test suite, we see that the CronJob and its Job are not cleaned up. The leftover CronJob is scheduled '* * * * *' (see the YAML below), so failing pods keep being recreated every few seconds:

[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get pods -n openshift-cnv | grep cron
cron-test-a6f5c71b-28351619-2prll   0/1   Error   0   5s
cron-test-a6f5c71b-28351619-8wkgx   0/1   Error   0   37s
cron-test-a6f5c71b-28351619-bkld5   0/1   Error   0   26s
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get pods -n openshift-cnv | grep cron
cron-test-a6f5c71b-28351620-wzdgk   0/1   Error   0   3s
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get pods -n openshift-cnv | grep cron
cron-test-a6f5c71b-28351620-j42l2   0/1   Error   0   19s
cron-test-a6f5c71b-28351620-wzdgk   0/1   Error   0   30s
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get pods -n openshift-cnv | grep cron
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get pods -n openshift-cnv | grep cron
cron-test-a6f5c71b-28351623-h86lq   0/1   ContainerCreating   0   0s
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get pods -n openshift-cnv | grep cron
cron-test-a6f5c71b-28351623-h86lq   0/1   Error   0   2s
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get pods -n openshift-cnv | grep cron
cron-test-a6f5c71b-28351623-5gtlp   0/1   Error   0   21s
cron-test-a6f5c71b-28351623-h86lq   0/1   Error   0   32s
cron-test-a6f5c71b-28351623-svbfj   0/1   ContainerCreating   0   0s
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get dataimportcron --all-namespaces
NAMESPACE                            NAME                        FORMAT
openshift-virtualization-os-images   centos-7-image-cron         pvc
openshift-virtualization-os-images   centos-stream8-image-cron   pvc
openshift-virtualization-os-images   centos-stream9-image-cron   pvc
openshift-virtualization-os-images   fedora-image-cron           pvc
openshift-virtualization-os-images   rhel8-image-cron            pvc
openshift-virtualization-os-images   rhel9-image-cron            pvc
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get cronjobs -n openshift-cnv
NAME                                 SCHEDULE       SUSPEND   ACTIVE   LAST SCHEDULE   AGE
centos-7-image-cron-1e248498         7 4/12 * * *   False     0        11h             3d
centos-stream8-image-cron-9290d33a   7 4/12 * * *   False     0        11h             3d
centos-stream9-image-cron-188404e1   7 4/12 * * *   False     0        11h             3d
cron-test-a6f5c71b                   * * * * *      False     0        36s             47h
fedora-image-cron-9a1f2246           7 4/12 * * *   False     0        11h             3d
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get jobs -n openshift-cnv
NAME                          COMPLETIONS   DURATION   AGE
cron-test-a6f5c71b-28351630   0/1           43s        43s
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get cronjob -n openshift-cnv cron-test-a6f5c71b -oyaml
apiVersion: batch/v1
kind: CronJob
metadata:
  creationTimestamp: "2023-11-25T15:43:51Z"
  generation: 3
  labels:
    app.kubernetes.io/component: storage
    app.kubernetes.io/managed-by: cdi-controller
    app.kubernetes.io/part-of: hyperconverged-cluster
    app.kubernetes.io/version: 4.14.1
    cdi.kubevirt.io/dataImportCron: cdi-e2e-tests-dataimportcron-func-test-ch5s8.cron-test
  name: cron-test-a6f5c71b
  namespace: openshift-cnv
  ownerReferences:
  - apiVersion: cdi.kubevirt.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: CDI
    name: cdi-kubevirt-hyperconverged
    uid: 974943ef-6c2d-498b-96cc-a3dc9a7fdd74
  resourceVersion: "8733559"
  uid: db47a2f5-843d-445d-b374-592bdc589813
spec:
  concurrencyPolicy: Forbid
  failedJobsHistoryLimit: 1
  jobTemplate:
    metadata:
      creationTimestamp: null
    spec:
      backoffLimit: 2
      template:
        metadata:
          creationTimestamp: null
        spec:
          containers:
          - command:
            - /usr/bin/cdi-source-update-poller
            - -ns
            - cdi-e2e-tests-dataimportcron-func-test-ch5s8
            - -cron
            - cron-test
            - -url
            - docker://cdi-docker-registry-host.openshift-cnv/tinycoreqcow2
            env:
            - name: INSECURE_TLS
              value: "true"
            - name: http_proxy
            - name: https_proxy
            - name: no_proxy
            image: registry.redhat.io/container-native-virtualization/virt-cdi-importer-rhel9@sha256:75a9f754acba4cc158ebac58b161b70f964802a4ce9915cb20db413854af2830
            imagePullPolicy: IfNotPresent
            name: cdi-source-update-poller
            resources: {}
            securityContext:
              allowPrivilegeEscalation: false
              capabilities:
                drop:
                - ALL
              runAsNonRoot: true
              runAsUser: 107
              seccompProfile:
                type: RuntimeDefault
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
          dnsPolicy: ClusterFirst
          restartPolicy: Never
          schedulerName: default-scheduler
          securityContext: {}
          serviceAccount: cdi-cronjob
          serviceAccountName: cdi-cronjob
          terminationGracePeriodSeconds: 0
      ttlSecondsAfterFinished: 10
  schedule: '* * * * *'
  successfulJobsHistoryLimit: 1
  suspend: false
status:
  active:
  - apiVersion: batch/v1
    kind: Job
    name: cron-test-a6f5c71b-28351632
    namespace: openshift-cnv
    resourceVersion: "8733558"
    uid: 961f3160-d661-4e53-b3d6-1d5059b66d63
  lastScheduleTime: "2023-11-27T15:12:00Z"
  lastSuccessfulTime: "2023-11-25T15:45:04Z"
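A manual cleanup sketch we have been using until a fix lands (the CronJob name is the one observed on this cluster and differs per run); deleting the CronJob should cascade to its Jobs and pods through their owner references:

# delete the leftover CronJob created for the removed test DataImportCron
oc delete cronjob cron-test-a6f5c71b -n openshift-cnv
# confirm nothing gets recreated
oc get cronjobs,jobs,pods -n openshift-cnv | grep cron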
Version-Release number of selected component (if applicable):
Seen on 4.14, 4.15
How reproducible:
Seen twice. We don't usually reuse clusters from tier1 runs, but when we do, we hit this issue and it blocks tier2/tier3 runs.
Steps to Reproduce:
1. Run the tier1 CDI OCS test suite (we don't know which test causes it)
Actual results:
The CronJob and its Job are not cleaned up.
Expected results:
The CronJob and its Job are cleaned up once the test's DataImportCron is gone.
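A minimal verification sketch for the expected behaviour, assuming the cdi.kubevirt.io/dataImportCron label shown in the CronJob YAML above is set on every CDI-managed CronJob: each remaining CronJob should map to a DataImportCron that still exists, and the cron-test CronJob should be gone.

# list CDI-managed CronJobs together with the DataImportCron they were created for
oc get cronjobs -n openshift-cnv -L cdi.kubevirt.io/dataImportCron
# compare against the DataImportCrons that still exist
oc get dataimportcron --all-namespaces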
Additional info:
This is a potential test blocker: all remaining test jobs are aborted when broken pods are present in the openshift-cnv namespace.
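As a possible guard for reused clusters (an assumption about how the pipeline could check, not something it does today), a pre-run step could fail fast on leftover failing cron pods instead of aborting later test jobs:

# hypothetical pre-run check: fail if leftover cron pods are stuck in Error in openshift-cnv
if oc get pods -n openshift-cnv --no-headers | grep cron | grep -q Error; then
  echo "leftover DataImportCron pods found - clean up before running tier2/3"
  exit 1
fi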
Git Pull Request: https://github.com/kubevirt/containerized-data-importer/pull/3106 closed
Git Pull Request: https://github.com/kubevirt/containerized-data-importer/pull/3120 closed
is cloned by: CNV-38889 [4.14] In some flow, cronjob / job are not cleaned-up (Closed)
links to: RHEA-2024:128638 OpenShift Virtualization 4.15.1 Images