CNV-23976

[2158591] Virtual Machines are not moving into "Paused" status when Ceph storage backend is unavailable

      Description of problem:

      VMs created using OpenShift Virtualization are by default configured to pause if the storage returns an I/O error. This is defined by error_policy='stop' in the libvirt domain XML, as shown below, and propagates to QEMU as "werror=stop,rerror=stop":

      ~~~
      <disk type='block' device='disk' model='virtio-non-transitional'>
        <driver name='qemu' type='raw' cache='none' error_policy='stop' io='native'/> <<<
        <source dev='/dev/rootdisk' index='2'/>
        <backingStore/>
        <target dev='vda' bus='virtio'/>
        <alias name='ua-rootdisk'/>
        <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
      </disk>
      ~~~
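
      A quick way to confirm the effective policy from inside the virt-launcher pod; the namespace, pod, and domain names below are hypothetical examples (KubeVirt names the domain <namespace>_<vm-name>):

      ~~~
      # Names are examples; adjust namespace/pod/domain to your environment.
      oc exec -n my-namespace virt-launcher-myvm-abcde -- \
          virsh dumpxml my-namespace_myvm | grep error_policy

      # The same policy is visible on the QEMU command line:
      oc exec -n my-namespace virt-launcher-myvm-abcde -- \
          ps -ef | grep -o 'werror=stop,rerror=stop'
      ~~~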

      While using the Ceph backend, the device is mapped using the kernel rbd module, and the I/O timeout is controlled by osd_request_timeout, which defaults to 0 [1]. This means the I/O will wait forever and never time out, so QEMU will never get an EIO for the pending I/Os and will never move the VM into "Paused".
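
      The options a device was mapped with can be checked on the node via sysfs; with the default of 0, no osd_request_timeout appears (the rbd device id below is an example):

      ~~~
      # On the worker node; device id 0 is an example.
      cat /sys/bus/rbd/devices/0/config_info
      ~~~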

      Also, we cannot power down the VM in this state. When we power down the VM, the virt-launcher pod ends up in "Terminating" status and qemu-kvm in uninterruptible sleep (D state):

      ~~~
      [root@worker-0 ~]# ps aux|grep qemu-kvm
      107 1112413 12.7 0.0 0 0 ? D 18:12 0:47 [qemu-kvm]
      ~~~

      Force deleting the virt-launcher pod will remove the pod, but the qemu-kvm process will still be present on the OCP node, and the only way to get rid of it is to reboot the node.
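
      For reference, a sketch of the force delete (pod and namespace names are examples); it removes only the pod object, while the D-state qemu-kvm process stays behind on the node:

      ~~~
      # Names are examples; this removes the API object only.
      oc delete pod virt-launcher-myvm-abcde -n my-namespace \
          --grace-period=0 --force
      ~~~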

      It is technically possible to override osd_request_timeout while mapping the device using -o osd_request_timeout=<seconds>, and ceph-csi can pass this through via the mapOptions StorageClass parameter (see the sketch below). However, [2] and [3] don't recommend setting osd_request_timeout, even though other block storage such as FC or iSCSI does have an I/O timeout.
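
      A minimal sketch of both routes, assuming a ceph-csi RBD StorageClass; the pool, provisioner, and timeout values are examples, and the StorageClass is abridged (a working one also needs clusterID and the csi.storage.k8s.io secret parameters):

      ~~~
      # Manual mapping with a custom timeout (value in seconds is an example):
      rbd device map mypool/myimage -o osd_request_timeout=60

      # ceph-csi: the same option via the mapOptions StorageClass parameter:
      cat <<'EOF' | oc apply -f -
      apiVersion: storage.k8s.io/v1
      kind: StorageClass
      metadata:
        name: rbd-io-timeout
      provisioner: openshift-storage.rbd.csi.ceph.com
      parameters:
        pool: mypool
        mapOptions: "osd_request_timeout=60"
      reclaimPolicy: Delete
      EOF
      ~~~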

      [1] https://github.com/torvalds/linux/blob/85c7000fda0029ec16569b1eec8fd3a8d026be73/include/linux/ceph/libceph.h#L78
      [2] https://patchwork.kernel.org/project/ceph-devel/patch/1527132420-10740-1-git-send-email-dongsheng.yang@easystack.cn/
      [3] https://github.com/ceph/ceph/pull/20792#pullrequestreview-102251868

      Version-Release number of selected component (if applicable):

      OpenShift Virtualization 4.11.1

      How reproducible:

      100%

      Steps to Reproduce:

      1. Block the communication between the Ceph storage and the worker node where the VM is running (see the sketch after these steps).
      2. The VM will hang, with the network layer still responding to ping requests. Check the virsh output and the domain will still be reported as running.
      3. Try shutting down the VM. The virt-launcher pod will get stuck in "Terminating".
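
      One way to simulate step 1 on the worker node; the Ceph public network CIDR and the pod name are example placeholders:

      ~~~
      # On the worker node: drop traffic towards the Ceph public network
      # (the CIDR is an example):
      iptables -A OUTPUT -d 192.168.24.0/24 -j DROP

      # Inside the virt-launcher pod the domain still shows as running:
      oc exec -n my-namespace virt-launcher-myvm-abcde -- virsh list
      ~~~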

      Actual results:

      Virtual Machines are not moving into "Paused" status when Ceph storage backend is unavailable

      Expected results:

      The customer who reported the issue expects the VM to go down or be paused when the storage goes down. Since the VM is still pingable, this prevents users from building HA applications that use an election mechanism relying on health reports inferred from network connectivity, even if they spread their workload across multiple AZs.

      Additional info:
