Uploaded image for project: 'RHEL'
  1. RHEL
  2. RHEL-108735

Windows BSOD with nbd read and write delay

Linking RHIVOS CVEs to...Migration: Automation ...Sync from "Extern...XMLWordPrintable

    • None
    • Critical
    • ZStream
    • rhel-virt-windows
    • 8
    • False
    • False
    • Hide

      None

      Show
      None
    • Yes
    • None
    • Regression Exception
    • Requested
    • virtio-win-prewhql-0.1-290
    • None
    • Unspecified Release Note Type - Unknown
    • Unspecified
    • Unspecified
    • Unspecified
    • other
    • Windows
    • None

      Description of problem:

      Outages of 2  worker nodes  hosting MON pods(2 out of 3 )in different ODF zones causes some of the  windows VMs to crash and transition to recovery mode post chaos after the network latency has been removed. Worker node outage was created by introducing  60 seconds of network latency.

       

      Version-Release number of selected component (if applicable):

      openshift-cnv                                      kubevirt-hyperconverged-operator.v4.18.8          OpenShift Virtualization           4.18.8                kubevirt-hyperconverged-operator.v4.18.3          Succeeded
      openshift-ovirt-infra                              node-healthcheck-operator.v0.9.0                  Node Health Check Operator         0.9.0                 node-healthcheck-operator.v0.8.2                  Succeeded
      

      How reproducible:

      1. label the worker nodes with MON pods with label chaos=odf 
      2. Introduced 60s network latency for 15mins

      Steps to Reproduce:

      1. label the worker nodes with MON pods with label chaos=odf 
      2.Run chaos command, to cause 60s network latency for 15mins
      podman run --rm -e LABEL_SELECTOR="chaos=odf" -e INSTANCE_COUNT=2  -e DURATION=900 -e TRAFFIC_TYPE=egress  -e  EGRESS='{latency: 60000ms}' -e KUBECONFIG=/tmp/config  -e KRKN_KUBE_CONFIG=/tmp/config -e DISTRIBUTION='openshift' --net=host -v /tmp/config:/tmp/config:Z  quay.io/krkn-chaos/krkn-hub:network-chaos 3.
      

      Actual results:

      17 out of 255 Windows VM were not accessible through ssh.Of the 17, some of them had blank screen and most were at windows recovery screen post chaos after the MON worker nodes have recovered.

      Expected results:

      All the Windows VMs should be accessible through SSH post chaos.

      Additional info:

       

              rhn-support-vrozenfe Vadim Rozenfeld
              ysubrama@redhat.com Yogananth Subramanian
              Vadim Rozenfeld
              Virt Windows SST Bugs Virt Windows SST Bugs
              Menghuan Li Menghuan Li
              Votes:
              0 Vote for this issue
              Watchers:
              10 Start watching this issue

                Created:
                Updated: