OpenShift Bugs / OCPBUGS-43370

The device is wiped if the non-persistent sdN name changes after a reboot


    • Important
    • None
    • 5
    • OCPEDGE Sprint 262
    • 1
    • False
    • None
    • Previously, the `forceWipeDevicesAndDestroyAllData` field was based on the `LVMVolumeGroupNodeStatus` object and its tracking of the volume group. With this fix, the annotation `wiped.devices.lvms.openshift.io` is associated with the `LVMVolumeGroup` object. While removing this annotation can lead to unpredictable results, the `LVMVolumeGroupNodeStatus` object no longer holds important state information for volume group tracking and can be safely removed. This also fixes a case where a device name change after a reboot can lead to a re-wipe.
    • Bug Fix
    • In Progress
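
The release note above describes moving the wipe record from `LVMVolumeGroupNodeStatus` to an annotation on the `LVMVolumeGroup` object. A minimal Python sketch of that idea follows. It is not the operator's code: only the annotation name `wiped.devices.lvms.openshift.io` comes from the release note, while the per-node key suffix, the timestamp value, and the helper names `annotation_key`, `needs_wipe`, and `mark_wiped` are assumptions made for illustration.

```python
from datetime import datetime, timezone

# Annotation name taken from the release note; the "/<node>" suffix used
# below is a hypothetical layout, not confirmed by this report.
WIPED_ANNOTATION = "wiped.devices.lvms.openshift.io"


def annotation_key(node_name: str) -> str:
    """Build the (assumed) per-node annotation key."""
    return f"{WIPED_ANNOTATION}/{node_name}"


def needs_wipe(annotations: dict, node_name: str, force_wipe: bool) -> bool:
    """Wipe only if requested and no wipe was recorded for this node yet."""
    return force_wipe and annotation_key(node_name) not in annotations


def mark_wiped(annotations: dict, node_name: str) -> dict:
    """Return a copy of the annotations with the wipe recorded."""
    updated = dict(annotations)
    updated[annotation_key(node_name)] = datetime.now(timezone.utc).isoformat()
    return updated


# First reconcile: nothing recorded yet, so the wipe runs once.
annotations = {}
node = "openshift-worker-cygnus-0"
first = needs_wipe(annotations, node, force_wipe=True)
annotations = mark_wiped(annotations, node)

# After a reboot the kernel name may differ, but the record is keyed by
# volume group object and node, so no second wipe is requested.
second = needs_wipe(annotations, node, force_wipe=True)
print(first, second)
```

Because a record of this shape never mentions a /dev name, a rename from vda to vdb cannot make the device look new.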

      Description of problem:

      An LVMCluster resource was created with forceWipeDevicesAndDestroyAllData set to true on device "virtio-pci-0000:0a:00.0":

      # oc get lvmcluster my-lvmcluster -o yaml |yq '.spec'
      storage:
        deviceClasses:
          - default: true
            deviceSelector:
              forceWipeDevicesAndDestroyAllData: true
              paths:
                - /dev/disk/by-path/virtio-pci-0000:0a:00.0
            fstype: xfs
            name: vg1
      [root@openshift-worker-cygnus-0 ~]# ls -l /dev/disk/by-path/pci-0000:0a:00.0
      lrwxrwxrwx. 1 root root 9 Oct 15 14:07 /dev/disk/by-path/pci-0000:0a:00.0 -> ../../vda
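
The `ls -l` output above shows that a by-path entry is only a symlink to whatever kernel name the disk currently has. A small Python sketch of the same lookup uses `os.path.realpath`. The helper name `resolve_device` is hypothetical, and the demo builds a throwaway symlink in a temporary directory so that it runs on machines without this disk.

```python
import os
import tempfile


def resolve_device(persistent_path: str) -> str:
    """Follow a /dev/disk/by-* style symlink to the path it currently targets."""
    return os.path.realpath(persistent_path)


# Demo with a stand-in layout: <tmp>/vda plays the role of /dev/vda and
# <tmp>/by-path/pci-0000:0a:00.0 plays the role of the persistent link.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "vda")
    open(target, "w").close()
    link_dir = os.path.join(tmp, "by-path")
    os.mkdir(link_dir)
    link = os.path.join(link_dir, "pci-0000:0a:00.0")
    os.symlink("../vda", link)  # relative target, like the links under /dev/disk
    resolved = resolve_device(link)
    print(os.path.basename(resolved))
```

On a real node, calling `resolve_device` on the by-path entry before and after a reboot would show whether the kernel name has moved.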
      

      Created a PVC on the storage class, which created an LV on the VG:

      [root@openshift-worker-cygnus-0 ~]# lvs
        LV                                   VG  Attr       LSize Pool        Origin Data%  Meta%  Move Log Cpy%Sync Convert
        e008e4f9-17d6-4dc3-a03c-945529788134 vg1 Vwi-aotz-- 1.00g thin-pool-1        0.00
      

      Then, to change the kernel device name, added a new disk to the node with a PCI address lower than that of the device above, so that it is detected first and named vda. Rebooted the node, and "0000:0a:00.0" is now vdb:

      # ls -l /dev/disk/by-path/virtio-pci-0000:0a:00.0
      lrwxrwxrwx. 1 root root 9 Oct 15 14:17 /dev/disk/by-path/virtio-pci-0000:0a:00.0 -> ../../vdb
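
The rename shown above is enough to confuse any bookkeeping keyed on the kernel name. The Python sketch below illustrates this with the values from this report. How vgmanager actually tracked devices is not shown here, so the `known_by_kernel_name` and `known_by_path` sets and both helper functions are purely hypothetical stand-ins.

```python
# Symlink targets observed in this report, before and after the reboot.
BY_PATH = "/dev/disk/by-path/virtio-pci-0000:0a:00.0"
before_reboot = {BY_PATH: "/dev/vda"}
after_reboot = {BY_PATH: "/dev/vdb"}

# Hypothetical bookkeeping recorded when the volume group was first created.
known_by_kernel_name = {before_reboot[BY_PATH]}
known_by_path = {BY_PATH}


def looks_new_by_kernel_name(path: str, links: dict) -> bool:
    """Treat the device as new if its current /dev name was never recorded."""
    return links[path] not in known_by_kernel_name


def looks_new_by_path(path: str) -> bool:
    """Treat the device as new only if its persistent path was never recorded."""
    return path not in known_by_path


# After the reboot the same disk resolves to /dev/vdb.
print(looks_new_by_kernel_name(BY_PATH, after_reboot))  # kernel-name view
print(looks_new_by_path(BY_PATH))                       # persistent-path view
```

Under these assumptions, only the kernel-name view reports the disk as new after the reboot, which is the condition under which a forced wipe would be repeated.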

       

      However, after the reboot, vgmanager detected it as a new device and wiped the disk:

      {"level":"info","ts":"2024-10-15T14:17:19Z","msg":"device wiped successfully","controller":"lvmvolumegroup","controllerGroup":"lvm.topolvm.io","controllerKind":"LVMVolumeGroup","LVMVolumeGroup":{"name":"vg1","namespace":"openshift-storage"},"namespace":"openshift-storage","name":"vg1","reconcileID":"172a0d2c-3f65-418e-a2a2-3454c18709d0","deviceName":"/dev/vdb"}
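
The vgmanager log line above is structured JSON, so wiped devices can be pulled out mechanically when auditing a node. A Python sketch using only the standard library follows. The log line is abbreviated to the fields used here, and the helper name `wiped_devices` is hypothetical.

```python
import json

# Abbreviated copy of the log line from this report.
LOG_LINES = [
    '{"level":"info","ts":"2024-10-15T14:17:19Z","msg":"device wiped successfully",'
    '"controller":"lvmvolumegroup","LVMVolumeGroup":{"name":"vg1",'
    '"namespace":"openshift-storage"},"deviceName":"/dev/vdb"}',
]


def wiped_devices(lines):
    """Yield (timestamp, volume group, device) for every successful wipe."""
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines mixed into the log
        if not isinstance(entry, dict):
            continue  # skip JSON values that are not log objects
        if entry.get("msg") == "device wiped successfully":
            vg = entry.get("LVMVolumeGroup", {}).get("name")
            yield entry.get("ts"), vg, entry.get("deviceName")


for ts, vg, device in wiped_devices(LOG_LINES):
    print(ts, vg, device)
```

Comparing the reported `deviceName` values against the current targets of the configured by-path links is one way to spot a wipe that followed a rename.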

       

      The previous LVs were lost, and the pod using the PVC failed to start:

      117s        Warning   FailedMapVolume                pod/ubi-for-perf                                                       MapVolume.MapPodDevice failed for volume "pvc-53a66e70-9dcc-46b0-b70b-db0e84378954" : rpc error: code = NotFound desc = failed to find LV: e008e4f9-17d6-4dc3-a03c-945529788134

       

      Version-Release number of selected component (if applicable):

      lvms-operator.v4.16.3

      Steps to Reproduce:

      1. Create an LVMCluster using persistent names such as by-path or by-id, and set forceWipeDevicesAndDestroyAllData to true.

      2. Force a rename of the device, for example by adding a disk with a lower PCI address so that it is detected first.

      3. Reboot the node. The device will be wiped by the vgmanager on reboot.

      Actual results:

      The device is wiped if the non-persistent sdN name changes after a reboot.

      Expected results:

      The non-persistent name can change during boot; refer to "Disadvantages of non-persistent naming attributes". vgmanager should not wipe the device if the /dev/sdN name changes after a reboot.

      Additional info:

       

              sakbas@redhat.com Suleyman Akbas
              rhn-support-nashok Nijin Ashok
              Minal Pradeep Makwana Minal Pradeep Makwana