  1. OpenShift Virtualization
  2. CNV-50278

cdi-deployment pod encountering nil pointer errors when default virt storageclass changes


    • Quality / Stability / Reliability
    • 2
    • False
    • False
    • CNV v4.18.0.rhel9-108
    • Storage Core Sprint 263
    • Moderate
    • None

      Description of problem:

      The cdi-deployment pod goes into CrashLoopBackOff status because of nil pointer dereferences. For example:
      
      {"level":"info","ts":"2024-10-25T18:50:54Z","msg":"Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference","controller":"dataimportcron-controller","object":{"name":"centos-7-image-cron","namespace":"openshift-virtualization-os-images"},"namespace":"openshift-virtualization-os-images","name":"centos-7-image-cron","reconcileID":"6bc87a0c-973d-4adc-9d35-7d7a2e4547ac"}
      panic: runtime error: invalid memory address or nil pointer dereference [recovered]
      	panic: runtime error: invalid memory address or nil pointer dereference
      [signal SIGSEGV: segmentation violation code=0x1 addr=0x148 pc=0x179010a]
      
      goroutine 979 [running]:
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
      	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:111 +0x1e5
      panic({0x196dd00?, 0x2c48f10?})
      	/usr/lib/golang/src/runtime/panic.go:770 +0x132
      kubevirt.io/containerized-data-importer/pkg/controller.(*DataImportCronReconciler).handleSnapshot(0xc000638ab0, {0x1ec2bb8, 0xc001a3ac30}, 0xc00209e008, 0x0, 0xc0018fc780)
      	/remote-source/app/pkg/controller/dataimportcron-controller.go:692 +0x4a
      kubevirt.io/containerized-data-importer/pkg/controller.(*DataImportCronReconciler).handleCronFormat(0x1ea7c20?, {0x1ec2bb8?, 0xc001a3ac30?}, 0xc00080bb90?, 0x1ec2bb8?, {0xc001492970?, 0xc000f8a600?}, 0xc00080bb90?)
      	/remote-source/app/pkg/controller/dataimportcron-controller.go:685 +0x5d
      kubevirt.io/containerized-data-importer/pkg/controller.(*DataImportCronReconciler).update.func1()
      	/remote-source/app/pkg/controller/dataimportcron-controller.go:334 +0xc5
      kubevirt.io/containerized-data-importer/pkg/controller.(*DataImportCronReconciler).update(0xc000638ab0, {0x1ec2bb8, 0xc001a3ac30}, 0xc00209e008)
      	/remote-source/app/pkg/controller/dataimportcron-controller.go:345 +0x3b8
      kubevirt.io/containerized-data-importer/pkg/controller.(*DataImportCronReconciler).Reconcile(0xc000638ab0, {0x1ec2bb8, 0xc001a3ac30}, {{{0xc000832900?, 0x5?}, {0xc000f980a8?, 0xc00109fd10?}}})
      	/remote-source/app/pkg/controller/dataimportcron-controller.go:126 +0x1e7
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x1ec7908?, {0x1ec2bb8?, 0xc001a3ac30?}, {{{0xc000832900?, 0xb?}, {0xc000f980a8?, 0x0?}}})
      	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114 +0xb7
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0000fac60, {0x1ec2bf0, 0xc00030f8b0}, {0x1a0c0c0, 0xc00188a020})
      	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311 +0x3bc
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0000fac60, {0x1ec2bf0, 0xc00030f8b0})
      	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:261 +0x1be
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
      	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:222 +0x79
      created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 326
      	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:218 +0x486
      
      

      Version-Release number of selected component (if applicable):

      Observed in 4.99; the reason for getting this by default on OCP 4.16 should probably also be explored
      
      

      How reproducible:

      Every time the Ansible-Edge-Gitops validated pattern is installed
      
      

      Steps to Reproduce:

      1. Install Ansible-Edge-Gitops validated pattern (https://github.com/validatedpatterns/ansible-edge-gitops) 
      2. Observe delay in image imports
      

      Actual results:

      cdi-deployment crash-loops for around an hour, which delays VM deployment
      
      

      Expected results:

      Relatively rapid deployment of source datavolume images
      

      Additional info:

      Lengthy troubleshooting thread: https://redhat-internal.slack.com/archives/C068X44C8VB/p1729877078303419
      

              rhn-support-awels Alexander Wels
              martjack@redhat.com Martin Jackson
              Yan Du Yan Du