Bug
Resolution: Unresolved
Quality / Stability / Reliability
Description of problem:
High IO for a short duration (5 min) on all worker nodes causes some VirtualMachineInstances to crash.
Version-Release number of selected component (if applicable):
fence-agents-remediation.v0.6.0.yaml
kubevirt-hyperconverged-operator.v4.18.23.yaml
node-healthcheck-operator.v0.10.1.yaml
odf-operator.v4.18.14-rhodf.yaml
mcg-operator.v4.18.14-rhodf.yaml
odf-csi-addons-operator.v4.18.14-rhodf.yaml
cephcsi-operator.v4.18.14-rhodf.yaml
recipe.v4.18.14-rhodf.yaml
ocs-operator.v4.18.14-rhodf.yaml
ocs-client-operator.v4.18.14-rhodf.yaml
odf-prometheus-operator.v4.18.14-rhodf.yaml
rook-ceph-operator.v4.18.14-rhodf.yaml
odf-dependencies.v4.18.14-rhodf.yaml
How reproducible:
Run stress-ng with the below values for 5 minutes on all worker nodes (a rough standalone equivalent is sketched below):
io-block-size: 4k
io-write-bytes: 2g
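For reference, a minimal standalone sketch of the same load, assuming the krkn io-hog image maps these options onto stress-ng's hdd stressor (the exact flags krkn passes internally are an assumption here):

# Assumed mapping of krkn io-hog options to stress-ng hdd stressor flags:
#   io-block-size: 4k                            -> --hdd-write-size 4k
#   io-write-bytes: 2g                           -> --hdd-bytes 2g
#   duration: 300                                -> --timeout 300s
#   io-target-pod-folder on hostPath /root       -> --temp-path /root
#   workers: '' (cpu auto-detection)             -> one hdd worker per CPU
stress-ng --hdd "$(nproc)" --hdd-write-size 4k --hdd-bytes 2g --temp-path /root --timeout 300s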
Steps to Reproduce:
1. Run the io hog test using krkn (https://github.com/krkn-chaos/krkn) with the below config for the io-hog scenario:

   duration: 300
   workers: ''                     # leave it empty '' for node cpu auto-detection
   hog-type: io
   image: quay.io/krkn-chaos/krkn-hog
   namespace: default
   io-block-size: 4k
   io-write-bytes: 2g
   io-target-pod-folder: /hog-data
   # node-name: "worker-0"         # Uncomment to target a specific node by name
   io-target-pod-volume:
     name: node-volume
     hostPath:
       path: /root                 # a path writable by kubelet in the root filesystem of the node
   node-selector: "node-role.kubernetes.io/worker="
   number-of-nodes: ''
   taints: []                      # example: ["node-role.kubernetes.io/master:NoSchedule"]
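To observe the reported failure while the hog is running, the VMI state and related events can be watched; a minimal sketch (the virt-clone-clones namespace and the Stopped reason are taken from the events under "Actual results"):

# Watch VirtualMachineInstances across all namespaces for phase changes / crashes
oc get vmi -A -w

# After the run, list warning events for stopped/crashed VMIs
oc get events -n virt-clone-clones --field-selector type=Warning,reason=Stopped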
Actual results:
openshift-kni-infra   16m   Warning   Unhealthy   pod/keepalived-e10-h27-000-r660          Liveness probe failed: command timed out
virt-clone-clones     14m   Warning   Stopped     virtualmachineinstance/clone-vm-0-108    The VirtualMachineInstance crashed.
Expected results:
VirtualMachineInstances should be able to withstand high IO usage for a short duration.
Additional info:
Below is iostat output from one of the nodes. IO operations were done on sda (sda4), the root disk for the node:
avg-cpu: %user %nice %system %iowait %steal %idle
3.50 0.00 86.68 0.08 0.00 9.74
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz w/s wkB/s wrqm/s %wrqm w_await wareq-sz d/s dkB/s drqm/s %drqm d_await dareq-sz f/s f_await aqu-sz %util
nvme0c0n1 0.00 0.00 0.00 0.00 0.00 0.00 18.50 146.00 17.50 48.61 0.03 7.89 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.80
nvme0n1 0.00 0.00 0.00 0.00 0.00 0.00 18.50 146.00 0.00 0.00 0.00 7.89 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.80
rbd0 0.00 0.00 0.00 0.00 0.00 0.00 3.00 523.00 0.00 0.00 1209.83 174.33 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 3.63 52.75
rbd1 0.00 0.00 0.00 0.00 0.00 0.00 3.00 12.00 0.00 0.00 679.50 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 2.04 83.70
rbd10 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
rbd11 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
rbd12 0.00 0.00 0.00 0.00 0.00 0.00 1.50 522.00 0.00 0.00 2699.33 348.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.05 40.00
rbd13 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
rbd15 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
rbd16 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
rbd17 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
rbd18 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
rbd19 0.00 0.00 0.00 0.00 0.00 0.00 2.00 42.00 0.00 0.00 250.25 21.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.50 67.90
rbd2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
rbd20 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
rbd21 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
rbd22 15.00 524.00 0.00 0.00 39.03 34.93 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.59 63.75
rbd23 0.00 0.00 0.00 0.00 0.00 0.00 1.00 26.00 0.00 0.00 1369.50 26.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.37 100.00
rbd3 0.00 0.00 0.00 0.00 0.00 0.00 0.50 512.00 0.00 0.00 1848.00 1024.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.92 100.00
rbd4 0.00 0.00 0.00 0.00 0.00 0.00 0.50 64.00 0.00 0.00 3047.00 128.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.52 26.15
rbd5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
rbd6 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
rbd7 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
rbd8 42.50 1564.00 0.00 0.00 9.94 36.80 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.42 36.05
rbd9 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda 0.00 0.00 0.00 0.00 0.00 0.00 1948.50 34840.00 72.00 3.56 0.07 17.88 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.13 12.40
sdb 1.50 6.00 0.00 0.00 0.00 4.00 482.50 3730.00 12.00 2.43 0.54 7.73 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.26 55.50
sdc 0.50 2.00 0.00 0.00 0.00 4.00 603.00 4756.00 11.00 1.79 0.56 7.89 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.34 65.10
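For reference, extended per-device statistics like the above can be collected during the run with something along these lines (the interval is an assumption, not taken from this report):

# Extended device statistics, refreshed every 2 seconds
iostat -xd 2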
Though the stress was run on sda, its utilization is quite low compared to the rbd devices, which show high utilization and high wait times. The disks used for Ceph are sdb and sdc.
The Ceph health check (ceph -s) did not indicate any "slow ops" while stress-ng was run for 5 minutes.
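If this is reproduced again, RBD-side latency could be correlated with the VMI crashes using, for example (run from the Ceph toolbox; the toolbox deployment name below is an assumption for a typical ODF/rook-ceph setup):

# Assumed toolbox deployment name on an ODF/rook-ceph cluster
oc rsh -n openshift-storage deploy/rook-ceph-tools

# Inside the toolbox: overall health and any slow ops
ceph -s
ceph health detail

# Per-image RBD latency/IOPS during the stress window
rbd perf image iostat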