Type: Bug
Resolution: Done-Errata
Priority: Critical
Fix Version/s: 4.18.0
Affects Version/s: 4.17.z, 4.16.z, 4.18.0
Component/s: Cloud Compute / KubeVirt Provider
Labels:
- cee.next_proposed
- hackathon-4.18

Activity Type: Quality / Stability / Reliability
Blocked: False
Blocked Reason: None
Story Points: None
Severity: Critical
Regression: None

Target Backport Versions: None
Target Version: 4.18.z
Release Blocker: None
Sprint: None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Release Note Status: Done
Release Note Type: Release Note Not Required
Release Note Text: N/A

Escape Reason: None
Escape Impact: None
Corrective Measures: None
SDLC stage when should've been found: None

This is a clone of issue ~~OCPBUGS-50562~~. The following is the description of the original issue:
—
Deployed OpenShift Virtualization + ODF (internal mode) + HyperShift, created a managed control plane, and used OpenShift Virtualization VMs as worker nodes.

We found that when the node (KubeVirt VM) restarts so quickly that the pod never reaches Terminating and comes back on the same node, all workloads with PVCs fail to start, because the hotplug is not triggered again. The worker node VM cannot find the disk:

26s Warning FailedMount pod/mysql MountVolume.MountDevice failed for volume "pvc-4e777fa9-c1nd-4ea5-b244-07w4c3b8888b" : rpc error: code = Unknown desc = couldn't find device by serial id

How reproducible:
Steps to Reproduce:
1. Create a PV on the managed cluster and attach it to a pod.
2. In the OCP-V UI, restart the worker node where the pod is located.
3. Observe that the corresponding virt-launcher-xxxx pod is restarted and the VM comes back very quickly.

Actual results:
The pod fails to start because its storage is missing.
The storage information cannot be found in the VirtualMachineInstance or in the corresponding virt-launcher-xxxx pod.
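
To confirm the second symptom, one option is to dump the VirtualMachineInstance as JSON and check whether the volume is listed at all. The helper below is a minimal sketch, not part of any tool: the function name and the sample data are hypothetical, and it assumes the usual VMI layout in which volumes appear by name under `spec.volumes` and `status.volumeStatus`.

```python
def volume_presence(vmi: dict, volume_name: str) -> dict:
    """Report whether volume_name is listed in the VMI spec and status.

    Assumes `vmi` is the parsed JSON of a VirtualMachineInstance and that
    volumes are listed by name under spec.volumes and status.volumeStatus.
    """
    spec_volumes = vmi.get("spec", {}).get("volumes") or []
    status_volumes = vmi.get("status", {}).get("volumeStatus") or []
    return {
        "in_spec": any(v.get("name") == volume_name for v in spec_volumes),
        "in_status": any(v.get("name") == volume_name for v in status_volumes),
    }


# Hypothetical VMI after the quick restart: only the root disk is left.
sample_vmi = {
    "spec": {"volumes": [{"name": "rootdisk"}]},
    "status": {"volumeStatus": [{"name": "rootdisk"}]},
}
result = volume_presence(sample_vmi, "pvc-example")
```

In the failure described here, both flags would be expected to be false for the PVC-backed volume while the pod on the guest cluster is still scheduled to that node.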

Expected results:
The storage should persist across VM reboots.

Reference:
https://issues.redhat.com/browse/OCPBUGS-44350
https://issues.redhat.com/browse/CNV-54334

Additional:
Tested with the ODF CSI driver, volumeMode Filesystem + accessMode RWO: a DaemonSet pod, csi-rbdplugin, took responsibility for reattaching the volume when the node restarted and came back before the pod was evicted. The NFS CSI driver behaves the same way.

So kubevirt-csi also needs a similar feature.
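
As an illustration of the requested behavior only, the check could look like the sketch below: compare the volumes each node is expected to have against the volumes currently listed on the backing VM, and report the ones that need to be hotplugged again. This is not the kubevirt-csi implementation; the function name, the input shapes, and the node and volume names are all hypothetical.

```python
from typing import Dict, Set


def volumes_to_reattach(
    expected: Dict[str, Set[str]],
    attached: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Return, per node, the expected volumes that are not attached.

    expected: node name -> volumes the driver believes are published there
    attached: node name -> volumes currently listed on the backing VM
    Both mappings are assumptions made for this sketch.
    """
    missing: Dict[str, Set[str]] = {}
    for node, wanted in expected.items():
        gap = wanted - attached.get(node, set())
        if gap:
            missing[node] = gap
    return missing


# Hypothetical state after a quick VM restart: worker-0 lost one hotplugged volume.
expected = {"worker-0": {"pvc-a", "pvc-b"}, "worker-1": {"pvc-c"}}
attached = {"worker-0": {"pvc-a"}, "worker-1": {"pvc-c"}}
pending = volumes_to_reattach(expected, attached)
```

Running such a comparison when a node comes back, rather than only when a pod is rescheduled, would cover the case where the pod never reaches Terminating.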

blocks

OCPBUGS-54631 [4.17] Hotplug volumes doesn't auto-reattachment to the same Node (kubevirt VM) when restarting so quick pod never reaches Terminating

Closed

OCPBUGS-54775 [4.18] Hotplug volumes doesn't auto-reattachment to the same Node (kubevirt VM) when restarting so quick pod never reaches Terminating

Closed

causes

OCPBUGS-56015 Nodepool Stays in pending state in HCP cluster

Closed

is blocked by

OCPBUGS-50562 [4.19] Hotplug volumes doesn't auto-reattachment to the same Node (kubevirt VM) when restarting so quick pod never reaches Terminating

Closed

is cloned by

OCPBUGS-54775 [4.18] Hotplug volumes doesn't auto-reattachment to the same Node (kubevirt VM) when restarting so quick pod never reaches Terminating

Closed

links to

openshift/hypershift#5988: OCPBUGS-54630: Sync RBAC for attaching volumes on VM level

openshift/kubevirt-csi-driver#55: [release-4.18] OCPBUGS-54630: Ensure volume stays attached through reboots

RHBA-2025:4019 OpenShift Container Platform 4.18.z bug fix update


Assignee: Alex Kalenyuk
Reporter: OpenShift Prow Bot
QA Contact: Liangquan Li
Need Info From: None
Votes: 0
Watchers: 6
Created: 2025/04/04 5:48 PM
Updated: 2025/07/14 1:22 PM
Resolved: 2025/04/23 12:09 AM
