OpenShift Bugs / OCPBUGS-54630

[4.18] Hotplug volumes are not automatically reattached to the same node (KubeVirt VM) when it restarts so quickly that the pod never reaches Terminating

    • Quality / Stability / Reliability
    • Critical
    • Done
    • Release Note Not Required

      This is a clone of issue OCPBUGS-50562. The following is the description of the original issue:

      Deployed OpenShift Virtualization + ODF (internal mode) + HyperShift, created a managed control plane, and used OpenShift Virtualization VMs as worker nodes.

      When the node (a KubeVirt VM) restarts so quickly that the pod never reaches Terminating and comes back on the same node, all workloads with PVCs fail to start because the hotplug is not triggered again. The worker node VM cannot find the disk:

      26s Warning FailedMount pod/mysql MountVolume.MountDevice failed for volume "pvc-4e777fa9-c1nd-4ea5-b244-07w4c3b8888b" : rpc error: code = Unknown desc = couldn't find device by serial id

      How reproducible:
      Steps to Reproduce:
      1. Create a PV on the managed cluster and attach it to a pod
      2. In the OCP-V UI, restart the worker node on which the pod is running
      3. Observe that the corresponding virt-launcher-xxxx pod is restarted and the VM comes back up very quickly

      Actual results:
      The pod fails to start because its storage is missing.
      The storage information is absent from both the VirtualMachineInstance and the corresponding virt-launcher-xxxx pod.
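The check described above (is the volume still listed on the VirtualMachineInstance?) could be scripted roughly as follows. This is only a sketch: it works on a VMI manifest that has already been loaded into a Python dict, and it assumes hotplugged volumes appear under `spec.volumes` with `hotpluggable: true` on their `persistentVolumeClaim` or `dataVolume` source. The function names and the sample volume name `pvc-example` are hypothetical.

```python
from typing import Any


def hotplug_volume_names(vmi: dict[str, Any]) -> set[str]:
    """Return the names of volumes in a VMI manifest marked as hotpluggable."""
    names: set[str] = set()
    for volume in vmi.get("spec", {}).get("volumes", []):
        # Assumption: a hotplugged volume sets hotpluggable: true under its
        # persistentVolumeClaim or dataVolume source.
        for source_key in ("persistentVolumeClaim", "dataVolume"):
            source = volume.get(source_key) or {}
            if source.get("hotpluggable"):
                names.add(volume["name"])
    return names


def is_volume_hotplugged(vmi: dict[str, Any], volume_name: str) -> bool:
    """True if the named volume is present as a hotplug volume on the VMI."""
    return volume_name in hotplug_volume_names(vmi)
```

Run against the VMI of the restarted worker node, a `False` result for a volume that a pod still needs would match the symptom reported here.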

      Expected results:
      The storage should remain attached after the VM reboots

      Reference:
      https://issues.redhat.com/browse/OCPBUGS-44350
      https://issues.redhat.com/browse/CNV-54334

      Additional:
      Tested with the ODF CSI driver (volumeMode Filesystem, accessMode RWO): a DaemonSet pod, csi-rbdplugin, took responsibility for reattaching the volume when the node restarted and came back before the pod was evicted. The NFS CSI driver behaves the same way.

      So kubevirt-csi also needs a similar feature.
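One way to picture the requested behavior is a small reconcile step that compares the volumes the node is expected to have with the volumes actually hotplugged into the VM, and re-triggers the hotplug for any that are missing. The sketch below is illustrative only and is not kubevirt-csi's implementation: `reconcile_hotplug` and the `hotplug` callback are hypothetical names, and how the expected and attached sets are obtained is left out.

```python
from typing import Callable, Iterable


def reconcile_hotplug(
    expected: Iterable[str],
    attached: Iterable[str],
    hotplug: Callable[[str], None],
) -> list[str]:
    """Re-hotplug every expected volume that is not currently attached.

    expected: volume names the node should have (hypothetical input, e.g.
              derived from the attachments recorded for that node).
    attached: volume names currently hotplugged into the VM.
    hotplug:  hypothetical callback that triggers the hotplug of one volume.
    Returns the names that were re-hotplugged, in sorted order.
    """
    missing = sorted(set(expected) - set(attached))
    for name in missing:
        hotplug(name)
    return missing
```

Because the step only acts on the difference between the two sets, running it again once the volumes are attached is a no-op, so it could run every time the node comes back regardless of whether the pod was evicted.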

              akalenyu Alex Kalenyuk
              openshift-crt-jira-prow OpenShift Prow Bot
              Liangquan Li Liangquan Li