CNV-67059 (OpenShift Virtualization)

virt-freezer freeze and unfreeze commands fail intermittently when the workload is deployed over gcfs (Fusion)


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • CNV v4.19.0
    • CNV Perf/Scale
    • Quality / Stability / Reliability

      Tested with Red Hat Enterprise Linux release 8.10 (Ootpa)

      Description of problem:

      During an OADP KubeVirt backup with Fusion storage, the unfreeze command failed for some VMs; the hook timeout of 30 seconds was reached. In a backup of 10 VMs, 1-2 VMs failed on the unfreeze command.

       

      Version-Release number of selected component (if applicable):

      OCP 4.19.0, CNV 4.19.1, Fusion-Access 0.9.4

       

      How reproducible:

      Deploy the Fusion Access operator and use a shared LUN as storage. Create a namespace with 10 VMs.

      Steps to Reproduce:

      1. Run OADP kubevirt backup
      2. Monitor backup progress
      3. Get errors during the backup
      

      Actual results:

      1-2 VMs failed on the unfreeze command

      Expected results:

      All VMs complete the unfreeze command

      Additional info:

      VM spec: RHEL 8.10, Disk1: 60 GB, Disk2: 150 GB, volumeMode: Filesystem

      Error from backup log:
      time="2025-08-05T08:52:37Z" level=info msg="stdout: " backup=openshift-adp/bkp-kubvirt-dm-10vms-fusion10-cloud33-10cc hookCommand="[/usr/bin/virt-freezer --unfreeze --name oadp-rhel8-2disks-vm-9 --namespace oadp-kubevirt-10vms-fusion10]" hookContainer=compute hookName="<from-annotation>" hookOnError=Fail hookPhase=post hookSource=annotation hookTimeout="{30s}" hookType=exec logSource="/remote-source/velero/app/pkg/podexec/pod_command_executor.go:180"
      time="2025-08-05T08:52:37Z" level=info msg="stderr: {\"component\":\"freezer\",\"level\":\"info\",\"msg\":\"Guest agent version is 6.2.0\",\"pos\":\"virt-freezer.go:114\",\"timestamp\":\"2025-08-05T08:52:31.989248Z\"}\n{\"component\":\"freezer\",\"level\":\"error\",\"msg\":\"Unfreezing VMI failed\",\"pos\":\"virt-freezer.go:146\",\"reason\":\"server error. command Unfreeze failed: \\\"LibvirtError(Code=86, Domain=10, Message='Guest agent is not responding: Guest agent not available for now')\\\"\",\"timestamp\":\"2025-08-05T08:52:36.997704Z\"}\n" backup=openshift-adp/bkp-kubvirt-dm-10vms-fusion10-cloud33-10cc hookCommand="[/usr/bin/virt-freezer --unfreeze --name oadp-rhel8-2disks-vm-9 --namespace oadp-kubevirt-10vms-fusion10]" hookContainer=compute hookName="<from-annotation>" hookOnError=Fail hookPhase=post hookSource=annotation hookTimeout="{30s}" hookType=exec logSource="/remote-source/velero/app/pkg/podexec/pod_command_executor.go:181"
      time="2025-08-05T08:52:37Z" level=error msg="Error executing hook" backup=openshift-adp/bkp-kubvirt-dm-10vms-fusion10-cloud33-10cc error="command terminated with exit code 1" hookPhase=post hookSource=annotation hookType=exec logSource="/remote-source/velero/app/internal/hook/item_hook_handler.go:239"
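
      To make the timing in the stderr payload easier to read, here is a minimal Python sketch that parses the two virt-freezer JSON lines (copied from the log above, with the outer log escaping removed) and measures the time between them. The variable names are illustrative only. It shows what the log itself reports, not a root cause.

```python
import json
from datetime import datetime

# The two JSON lines from the "stderr:" field above, outer escaping removed.
STDERR = r'''{"component":"freezer","level":"info","msg":"Guest agent version is 6.2.0","pos":"virt-freezer.go:114","timestamp":"2025-08-05T08:52:31.989248Z"}
{"component":"freezer","level":"error","msg":"Unfreezing VMI failed","pos":"virt-freezer.go:146","reason":"server error. command Unfreeze failed: \"LibvirtError(Code=86, Domain=10, Message='Guest agent is not responding: Guest agent not available for now')\"","timestamp":"2025-08-05T08:52:36.997704Z"}'''


def parse_ts(entry):
    """Parse the UTC timestamp field of one virt-freezer log entry."""
    return datetime.strptime(entry["timestamp"], "%Y-%m-%dT%H:%M:%S.%fZ")


entries = [json.loads(line) for line in STDERR.splitlines() if line.strip()]
errors = [e for e in entries if e["level"] == "error"]

# Seconds between the first virt-freezer message and the reported failure.
elapsed = (parse_ts(entries[-1]) - parse_ts(entries[0])).total_seconds()

print(f"virt-freezer reported the failure after {elapsed:.3f} s")
for e in errors:
    print(e["msg"], "|", e["reason"])
```

      On this sample, the error is logged about 5 seconds after virt-freezer starts, which is shorter than the 30-second hookTimeout shown in the same log line.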

      (unfreeze failed)

      # systemctl status qemu-guest-agent
        ● qemu-guest-agent.service - QEMU Guest Agent
           Loaded: loaded (/usr/lib/systemd/system/qemu-guest-agent.service; enabled; vendor preset: enabled)
           Active: active (running) since Mon 2025-08-04 07:57:49 EDT; 20h ago
         Main PID: 693 (qemu-ga)
            Tasks: 2 (limit: 10433)
           Memory: 2.1M
           CGroup: /system.slice/qemu-guest-agent.service
                   └─693 /usr/bin/qemu-ga --method=virtio-serial --path=/dev/virtio-ports/org.qemu.guest_agent.0 --blacklist=guest-file-open,guest-file-close,guest-file-read,guest-file-write,guest-file-seek,guest-file-flush,guest-exec,guest-exec>

        Aug 04 07:57:49 oadp-rhel8-2disks-vm-9 systemd[1]: Started QEMU Guest Agent.
        Aug 05 04:50:40 oadp-rhel8-2disks-vm-9 qemu-ga[693]: info: guest-fsfreeze called
        Aug 05 04:50:40 oadp-rhel8-2disks-vm-9 qemu-ga[693]: info: executing fsfreeze hook with arg 'freeze'
        Aug 05 04:52:41 oadp-rhel8-2disks-vm-9 qemu-ga[693]: info: executing fsfreeze hook with arg 'thaw'

      2 minutes 1 second between the freeze and thaw hooks

      (unfreeze succeeded)
      Aug 05 05:34:09 oadp-rhel8-2disks-vm-9 qemu-ga[693]: info: guest-fsfreeze called
      Aug 05 05:34:09 oadp-rhel8-2disks-vm-9 qemu-ga[693]: info: executing fsfreeze hook with arg 'freeze'
      Aug 05 05:34:35 oadp-rhel8-2disks-vm-9 qemu-ga[693]: info: executing fsfreeze hook with arg 'thaw'

      26 seconds between the freeze and thaw hooks
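
      The two freeze-to-thaw intervals can be recomputed from the qemu-ga journal lines with a short Python sketch. The helper name and the year argument are assumptions added for illustration (journal lines carry no year). The 30-second value is the hookTimeout from the backup log.

```python
from datetime import datetime

HOOK_TIMEOUT_S = 30  # hookTimeout="{30s}" in the backup log


def freeze_window_seconds(freeze_line, thaw_line, year=2025):
    """Seconds between two journal lines that start with 'Mon DD HH:MM:SS'."""
    def ts(line):
        stamp = " ".join(line.split()[:3])
        return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    return (ts(thaw_line) - ts(freeze_line)).total_seconds()


failed = freeze_window_seconds(
    "Aug 05 04:50:40 oadp-rhel8-2disks-vm-9 qemu-ga[693]: info: executing fsfreeze hook with arg 'freeze'",
    "Aug 05 04:52:41 oadp-rhel8-2disks-vm-9 qemu-ga[693]: info: executing fsfreeze hook with arg 'thaw'",
)
succeeded = freeze_window_seconds(
    "Aug 05 05:34:09 oadp-rhel8-2disks-vm-9 qemu-ga[693]: info: executing fsfreeze hook with arg 'freeze'",
    "Aug 05 05:34:35 oadp-rhel8-2disks-vm-9 qemu-ga[693]: info: executing fsfreeze hook with arg 'thaw'",
)

for label, seconds in (("failed run", failed), ("successful run", succeeded)):
    print(f"{label}: frozen for {seconds:.0f} s (hook timeout {HOOK_TIMEOUT_S} s)")
```

      If the guest clock is EDT (UTC-4), as the systemctl output suggests, and the clocks are in sync, the thaw at 04:52:41 in the failed run corresponds to 08:52:41Z, roughly 4 seconds after the hook error logged at 08:52:37Z.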

        1. Screenshot From 2025-08-18 12-20-05.png
          101 kB
          David Vaanunu
        2. Screenshot From 2025-08-18 12-07-10.png
          105 kB
          David Vaanunu
        3. Screenshot From 2025-08-18 12-22-08.png
          169 kB
          David Vaanunu
        4. Screenshot From 2025-08-18 11-53-42.png
          272 kB
          David Vaanunu

              amastbau Amos Mastbaum
              dvaanunu@redhat.com David Vaanunu
              Nir Rozen