Type: Bug
Resolution: Done-Errata
Priority: Normal
Fix Version/s: CNV v4.15.1
Affects Version/s: CNV v4.15.0
Component/s: Storage Platform
Labels: None
Story Points: 0.42
Blocked: False
Blocked Reason: None
Ready: False
[QE] How to address?: ---
[QE] Why QE missed?: ---
Sprint: Storage Core Sprint 249, Storage Core Sprint 250, Storage Core Sprint 251
Regression: No

Description of problem:

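After running the tier1 CDI OCS test suite, we see that the cron-test cronjob and its job are not cleaned up.
The pods are recreated continuously: the cronjob is scheduled every minute ('* * * * *'), and each job retries its failed pods (backoffLimit: 2).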

[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get pods -n openshift-cnv | grep cron
cron-test-a6f5c71b-28351619-2prll                      0/1     Error     0              5s
cron-test-a6f5c71b-28351619-8wkgx                      0/1     Error     0              37s
cron-test-a6f5c71b-28351619-bkld5                      0/1     Error     0              26s
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get pods -n openshift-cnv | grep cron
cron-test-a6f5c71b-28351620-wzdgk                      0/1     Error     0              3s
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get pods -n openshift-cnv | grep cron
cron-test-a6f5c71b-28351620-j42l2                      0/1     Error     0              19s
cron-test-a6f5c71b-28351620-wzdgk                      0/1     Error     0              30s
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get pods -n openshift-cnv | grep cron
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get pods -n openshift-cnv | grep cron
cron-test-a6f5c71b-28351623-h86lq                      0/1     ContainerCreating   0              0s
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get pods -n openshift-cnv | grep cron
cron-test-a6f5c71b-28351623-h86lq                      0/1     Error     0              2s
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get pods -n openshift-cnv | grep cron
cron-test-a6f5c71b-28351623-5gtlp                      0/1     Error               0              21s
cron-test-a6f5c71b-28351623-h86lq                      0/1     Error               0              32s
cron-test-a6f5c71b-28351623-svbfj                      0/1     ContainerCreating   0              0s

[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get dataimportcron --all-namespaces
NAMESPACE                            NAME                        FORMAT
openshift-virtualization-os-images   centos-7-image-cron         pvc
openshift-virtualization-os-images   centos-stream8-image-cron   pvc
openshift-virtualization-os-images   centos-stream9-image-cron   pvc
openshift-virtualization-os-images   fedora-image-cron           pvc
openshift-virtualization-os-images   rhel8-image-cron            pvc
openshift-virtualization-os-images   rhel9-image-cron            pvc

[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get cronjobs -n openshift-cnv
NAME                                 SCHEDULE       SUSPEND   ACTIVE   LAST SCHEDULE   AGE
centos-7-image-cron-1e248498         7 4/12 * * *   False     0        11h             3d
centos-stream8-image-cron-9290d33a   7 4/12 * * *   False     0        11h             3d
centos-stream9-image-cron-188404e1   7 4/12 * * *   False     0        11h             3d
cron-test-a6f5c71b                   * * * * *      False     0        36s             47h
fedora-image-cron-9a1f2246           7 4/12 * * *   False     0        11h             3d
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$
[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get jobs -n openshift-cnv
NAME                          COMPLETIONS   DURATION   AGE
cron-test-a6f5c71b-28351630   0/1           43s        43s

[cnv-qe-jenkins@cnv-qe-infra-01 ~]$ oc get cronjob -n openshift-cnv cron-test-a6f5c71b -oyaml
apiVersion: batch/v1
kind: CronJob
metadata:
  creationTimestamp: "2023-11-25T15:43:51Z"
  generation: 3
  labels:
    app.kubernetes.io/component: storage
    app.kubernetes.io/managed-by: cdi-controller
    app.kubernetes.io/part-of: hyperconverged-cluster
    app.kubernetes.io/version: 4.14.1
    cdi.kubevirt.io/dataImportCron: cdi-e2e-tests-dataimportcron-func-test-ch5s8.cron-test
  name: cron-test-a6f5c71b
  namespace: openshift-cnv
  ownerReferences:
  - apiVersion: cdi.kubevirt.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: CDI
    name: cdi-kubevirt-hyperconverged
    uid: 974943ef-6c2d-498b-96cc-a3dc9a7fdd74
  resourceVersion: "8733559"
  uid: db47a2f5-843d-445d-b374-592bdc589813
spec:
  concurrencyPolicy: Forbid
  failedJobsHistoryLimit: 1
  jobTemplate:
    metadata:
      creationTimestamp: null
    spec:
      backoffLimit: 2
      template:
        metadata:
          creationTimestamp: null
        spec:
          containers:
          - command:
            - /usr/bin/cdi-source-update-poller
            - -ns
            - cdi-e2e-tests-dataimportcron-func-test-ch5s8
            - -cron
            - cron-test
            - -url
            - docker://cdi-docker-registry-host.openshift-cnv/tinycoreqcow2
            env:
            - name: INSECURE_TLS
              value: "true"
            - name: http_proxy
            - name: https_proxy
            - name: no_proxy
            image: registry.redhat.io/container-native-virtualization/virt-cdi-importer-rhel9@sha256:75a9f754acba4cc158ebac58b161b70f964802a4ce9915cb20db413854af2830
            imagePullPolicy: IfNotPresent
            name: cdi-source-update-poller
            resources: {}
            securityContext:
              allowPrivilegeEscalation: false
              capabilities:
                drop:
                - ALL
              runAsNonRoot: true
              runAsUser: 107
              seccompProfile:
                type: RuntimeDefault
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
          dnsPolicy: ClusterFirst
          restartPolicy: Never
          schedulerName: default-scheduler
          securityContext: {}
          serviceAccount: cdi-cronjob
          serviceAccountName: cdi-cronjob
          terminationGracePeriodSeconds: 0
      ttlSecondsAfterFinished: 10
  schedule: '* * * * *'
  successfulJobsHistoryLimit: 1
  suspend: false
status:
  active:
  - apiVersion: batch/v1
    kind: Job
    name: cron-test-a6f5c71b-28351632
    namespace: openshift-cnv
    resourceVersion: "8733558"
    uid: 961f3160-d661-4e53-b3d6-1d5059b66d63
  lastScheduleTime: "2023-11-27T15:12:00Z"
  lastSuccessfulTime: "2023-11-25T15:45:04Z"
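
The numeric suffix in the job names above appears to encode the scheduled time: the Kubernetes CronJob controller names each job <cronjob-name>-<scheduled time in minutes since the Unix epoch>. The Python sketch below decodes that suffix. The function name is ours and is not part of any tool. For cron-test-a6f5c71b-28351632 it yields the same instant as lastScheduleTime in the status above, which confirms that a new job is scheduled every minute.

```python
from datetime import datetime, timezone


def scheduled_time_from_job_name(job_name: str) -> datetime:
    """Decode the scheduled time from the name of a Job created by a CronJob.

    Assumes the name ends with "-<minutes since the Unix epoch>".
    """
    minutes = int(job_name.rsplit("-", 1)[1])
    return datetime.fromtimestamp(minutes * 60, tz=timezone.utc)


if __name__ == "__main__":
    # Job name taken from the cronjob status in this report.
    print(scheduled_time_from_job_name("cron-test-a6f5c71b-28351632").isoformat())
    # prints 2023-11-27T15:12:00+00:00
```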

Version-Release number of selected component (if applicable):

Seen on 4.14, 4.15

How reproducible:

Seen twice. We don't usually reuse clusters from tier1 runs, but when we do, we hit this issue and it blocks tier2/3 runs.

Steps to Reproduce:

1. Run the tier1 CDI OCS test suite (we don't know which test causes it)

Actual results:

The cronjob and its job are not cleaned up.

Expected results:

The cronjob and its job are cleaned up.

Additional info:

It's a potential test blocker, because all remaining test jobs are aborted when we see broken pods in the openshift-cnv namespace.
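
One way to spot the leftover cronjob described above is to compare the cdi.kubevirt.io/dataImportCron label on each cronjob with the DataImportCrons that still exist. The Python sketch below is an illustration only, not the CDI fix. It assumes the label value has the form <namespace>.<name>, as in the cron-test YAML above, and that the inputs were already fetched (for example from `oc get cronjobs -o json`). The function name and the label on the centos-7 entry are hypothetical.

```python
LABEL = "cdi.kubevirt.io/dataImportCron"


def find_orphan_cronjobs(cronjobs, data_import_crons):
    """Return names of cronjobs whose DataImportCron no longer exists.

    cronjobs: list of dicts shaped like the items of `oc get cronjobs -o json`.
    data_import_crons: set of (namespace, name) tuples of existing DataImportCrons.
    """
    orphans = []
    for cronjob in cronjobs:
        metadata = cronjob.get("metadata", {})
        label = metadata.get("labels", {}).get(LABEL)
        if label is None:
            continue  # not created for a DataImportCron
        # Assumption: label value is "<namespace>.<name>"; namespaces contain no dots.
        namespace, _, name = label.partition(".")
        if (namespace, name) not in data_import_crons:
            orphans.append(metadata["name"])
    return orphans


if __name__ == "__main__":
    cronjobs = [
        {"metadata": {"name": "cron-test-a6f5c71b", "labels": {
            LABEL: "cdi-e2e-tests-dataimportcron-func-test-ch5s8.cron-test"}}},
        # Hypothetical label value, shown only for contrast.
        {"metadata": {"name": "centos-7-image-cron-1e248498", "labels": {
            LABEL: "openshift-virtualization-os-images.centos-7-image-cron"}}},
    ]
    existing = {("openshift-virtualization-os-images", "centos-7-image-cron")}
    print(find_orphan_cronjobs(cronjobs, existing))  # prints ['cron-test-a6f5c71b']
```

Applied to the listings in this report, only cron-test-a6f5c71b would be flagged, because no DataImportCron named cron-test shows up in the `oc get dataimportcron --all-namespaces` output.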

Git Pull Request: https://github.com/kubevirt/containerized-data-importer/pull/3106 (closed)

Git Pull Request: https://github.com/kubevirt/containerized-data-importer/pull/3120 (closed)

is cloned by: CNV-38889 [4.14] In some flow, cronjob / job are not cleaned-up (Closed)

links to: RHEA-2024:128638 OpenShift Virtualization 4.15.1 Images

mentioned on:
Merge request - Updated US source to: 714d6a5 Replace cron expression library with one used in kubernetes (#3127)
Merge request - Updated US source to: adc4aba Watch DIC-orphan cronjobs and cleanup them (#3106)

Assignee: Arnon Gilboa
Reporter: Jenia Peimer
QA Contact: Jenia Peimer
Votes: 0
Watchers: 6
Created: 2023/11/28 10:22 AM
Updated: 2024/04/01 5:33 AM
Resolved: 2024/04/01 5:33 AM
