  1. OpenShift Virtualization
  2. CNV-48348

Hotplug fails when migration mode is PostCopy


    • Icon: Bug Bug
    • Resolution: Unresolved
    • Icon: Critical Critical
    • CNV v4.20.0
    • CNV v4.17.0, CNV v4.18.0
    • CNV Virtualization
    • None
    • Quality / Stability / Reliability
    • 13
    • False
    • None
    • False
    • CNV v4.20.0.rhel9-31
    • Release Notes
    • When the live-migration mode is "PostCopy", hot-plugging CPU or memory resources fails.
    • Bug Fix
    • Proposed
    • Yes
    • CNV Virt-Cluster Sprint 263, CNV Virt-Cluster Sprint 264, CNV Virt-Cluster Sprint 265, CNV Virt-Cluster Sprint 266, CNV Virt-Cluster Sprint 267, CNV Virt-Node Sprint 271, CNV Virt-Node Sprint 272, CNV Virt-Node Sprint 273
    • Important
    • None

      Description of problem:

      When the live-migration mode is PostCopy, hot-plugging CPU or memory resources fails.

      Version-Release number of selected component (if applicable):

      cnv-4.17

      How reproducible:

      100%

      Steps to Reproduce:

      1. Create a migration policy that forces PostCopy migration mode:
      spec:
        allowAutoConverge: true
        allowPostCopy: true
        bandwidthPerMigration: 1Mi
        completionTimeoutPerGiB: 1
        selectors:
          virtualMachineInstanceSelector:
            post-copy-vm: 'true'
      
      2. Create and start a VM whose VMI carries the label post-copy-vm: 'true', so that the policy selects it.
      3. Modify the CPU or memory value in the VM CR and wait for the migration to start.
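Step 3 can be sketched with `oc patch`. This is a minimal sketch, not the reporter's exact procedure: the VM name, namespace, and target values are hypothetical placeholders, and it assumes the hotplug is requested through `spec.template.spec.domain.memory.guest` or `spec.template.spec.domain.cpu.sockets` on the VM CR.

```shell
# Hypothetical names: replace with the VM and namespace under test.
VM_NAME="${VM_NAME:-my-post-copy-vm}"
NAMESPACE="${NAMESPACE:-default}"

# Build a JSON merge patch that sets the guest memory (for example 4Gi).
memory_patch() {
  printf '{"spec":{"template":{"spec":{"domain":{"memory":{"guest":"%s"}}}}}}' "$1"
}

# Build a JSON merge patch that sets the CPU socket count (for example 4).
cpu_patch() {
  printf '{"spec":{"template":{"spec":{"domain":{"cpu":{"sockets":%d}}}}}}' "$1"
}

# Apply a patch to the VM CR; the hotplug is then expected to trigger a migration.
hotplug() {
  oc -n "$NAMESPACE" patch vm "$VM_NAME" --type merge -p "$1"
}

# Example (requires a cluster, so left commented out):
#   hotplug "$(memory_patch 4Gi)"
#   oc -n "$NAMESPACE" get vmim -w   # watch the resulting migration objects
```

Keeping the patch builders separate from the `oc` call makes it easy to print and inspect the patch before applying it.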
      

      Actual results:

      The migration succeeds, but the hotplug fails (no additional CPU or memory is added to the guest).

      Expected results:

      Hotplug succeeds

      Additional info:

      This happens only in PostCopy mode. In PreCopy mode, hotplug works.
      A regular migration (without hotplug) in PostCopy mode succeeds.
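To confirm which mode a migration used and whether the hotplug reached the guest, the VMI status can be queried. The helpers below are a sketch under assumptions: the function names are made up for illustration, and they assume the VMI exposes `status.migrationState.mode`, `status.memory.guestRequested`, `status.memory.guestCurrent`, and `status.currentCPUTopology.sockets`, which may differ between versions.

```shell
# Usage for all helpers: <helper> <vmi-name> <namespace>

# Print the mode recorded for the VMI's last migration (PreCopy or PostCopy).
migration_mode() {
  oc -n "$2" get vmi "$1" -o jsonpath='{.status.migrationState.mode}'
}

# Print requested and current guest memory, separated by a space.
# After a successful memory hotplug the two values should match.
guest_memory() {
  oc -n "$2" get vmi "$1" \
    -o jsonpath='{.status.memory.guestRequested} {.status.memory.guestCurrent}'
}

# Print the socket count currently reported for the running guest.
cpu_sockets() {
  oc -n "$2" get vmi "$1" -o jsonpath='{.status.currentCPUTopology.sockets}'
}
```

In the failing case described here, one would expect `migration_mode` to print `PostCopy` while the current memory or socket count stays at its pre-hotplug value.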
      
      
      VMI events
      Events:
        Type     Reason                                                                                                                                   Age                From                         Message
        ----     ------                                                                                                                                   ----               ----                         -------
        Normal   SuccessfulCreate                                                                                                                         79m                virtualmachine-controller    Created virtual machine pod virt-launcher-rhel-latest-post-copy-migration-vm-1726040077874r
        Normal   SuccessfulCreate                                                                                                                         79m                disruptionbudget-controller  Created PodDisruptionBudget kubevirt-disruption-budget-mdr92
        Normal   Created                                                                                                                                  79m                virt-handler                 VirtualMachineInstance defined.
        Normal   Started                                                                                                                                  79m                virt-handler                 VirtualMachineInstance started.
        Normal   SuccessfulCreate                                                                                                                         60m                workload-update-controller   Created Migration kubevirt-workload-update-zbbkc for automated workload update
        Normal   SuccessfulUpdate                                                                                                                         60m (x2 over 60m)  virtualmachine-controller    Expanded PodDisruptionBudget kubevirt-disruption-budget-mdr92
        Normal   Migrating                                                                                                                                60m                virt-handler                 VirtualMachineInstance is migrating.
        Normal   PreparingTarget                                                                                                                          60m                virt-handler                 Migration Target is listening at 10.128.2.53, on ports: 45095,35237,38283
        Normal   PreparingTarget                                                                                                                          60m (x2 over 60m)  virt-handler                 VirtualMachineInstance Migration Target Prepared.
        Warning  unknown error encountered sending command SyncVirtualMachineMemory: rpc error: code = DeadlineExceeded desc = context deadline exceeded  60m                virt-handler                 failed to update guest memory
        Normal   Migrated                                                                                                                                 53m                virt-handler                 The VirtualMachineInstance migrated to node virt-vk-417-fsc9x-worker-0-lqspf.
        Normal   Deleted                                                                                                                                  53m                virt-handler                 Signaled Deletion
        Normal   SuccessfulUpdate    
      
      
      Error logs from the destination virt-launcher pod:
      $ oc -n virt-migration-and-maintenance-test-post-copy-migration logs virt-launcher-rhel-latest-post-copy-migration-vm-172604007g87w6 | grep \"error\"
      {"component":"virt-launcher","level":"error","msg":"internal error: Unable to get session bus connection: Cannot spawn a message bus without a machine-id: Invalid machine ID in /var/lib/dbus/machine-id or /etc/machine-id","pos":"virGDBusGetSessionBus:126","subcomponent":"libvirt","thread":"40","timestamp":"2024-09-11T08:25:02.896000Z"}
      {"component":"virt-launcher","level":"error","msg":"internal error: Unable to get system bus connection: Could not connect: No such file or directory","pos":"virGDBusGetSystemBus:99","subcomponent":"libvirt","thread":"40","timestamp":"2024-09-11T08:25:02.896000Z"}
      {"component":"virt-launcher","level":"error","msg":"Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainMigrateFinish3Params)","pos":"virDomainObjBeginJobInternal:467","subcomponent":"libvirt","thread":"26","timestamp":"2024-09-11T08:25:43.566000Z"}
      {"component":"virt-launcher","level":"error","msg":"attaching virtio-mem device","pos":"manager.go:329","reason":"virError(Code=68, Domain=0, Message='Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainMigrateFinish3Params)')","timestamp":"2024-09-11T08:25:43.567882Z"}
      {"component":"virt-launcher","kind":"","level":"error","msg":"Failed update VMI guest memory","name":"rhel-latest-post-copy-migration-vm-1726040070-84991","namespace":"virt-migration-and-maintenance-test-post-copy-migration","pos":"server.go:734","reason":"virError(Code=68, Domain=0, Message='Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainMigrateFinish3Params)')","timestamp":"2024-09-11T08:25:43.568200Z","uid":"7fcd9148-72b3-4038-844c-2bf9337f94f5"}
      {"component":"virt-launcher","level":"error","msg":"Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainMigrateFinish3Params)","pos":"virDomainObjBeginJobInternal:467","subcomponent":"libvirt","thread":"91","timestamp":"2024-09-11T08:26:19.399000Z"}
      {"component":"virt-launcher","level":"error","msg":"Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainMigrateFinish3Params)","pos":"virDomainObjBeginJobInternal:467","subcomponent":"libvirt","thread":"29","timestamp":"2024-09-11T08:26:51.893000Z"}
      {"component":"virt-launcher","kind":"","level":"error","msg":"failed to sync guest time","name":"rhel-latest-post-copy-migration-vm-1726040070-84991","namespace":"virt-migration-and-maintenance-test-post-copy-migration","pos":"manager.go:373","timestamp":"2024-09-11T08:26:51.895069Z","uid":"7fcd9148-72b3-4038-844c-2bf9337f94f5"}
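The log lines above suggest that the hotplug call times out waiting for the domain job lock, which is reported as held by `remoteDispatchDomainMigrateFinish3Params`. A small filter can make that pattern easier to see in longer logs. This is a rough sketch: `lock_timeouts` is a hypothetical helper name, and it assumes one JSON object per line with the `timestamp` key after the lock message, as in the lines shown here.

```shell
# Read virt-launcher log lines on stdin and print "<timestamp> <lock holder>"
# for every line that reports a state change lock timeout.
lock_timeouts() {
  sed -n 's/.*held by monitor=\([^)]*\)).*"timestamp":"\([^"]*\)".*/\2 \1/p'
}

# Example (requires a cluster, so left commented out):
#   oc -n <namespace> logs <virt-launcher-pod> | lock_timeouts
```

Each output line pairs the time of a failed lock acquisition with the job that held the lock, which helps correlate the timeouts with the migration's finish phase.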
                                                          

              jelejosne Jed Lejosne
              vsibirsk Vasiliy Sibirskiy
              Sibo Wang Sibo Wang