OpenShift Virtualization / CNV-72388

Re-evaluate live migration timeout completionTimeoutPerGiB


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • CNV v4.22.0
    • None
    • CNV Virt-Node
    • None
    • Quality / Stability / Reliability
    • 0.42
    • False
    • False
    • None
    • None

      Description of problem:

      The default completionTimeoutPerGiB in CNV is 150 seconds per GiB.
      
      For a 50 GiB VM, that means up to 125 minutes (50 × 150 s = 7,500 s) spent trying to migrate it. That's way too much. Take a case where post-copy is not enabled: the migration keeps trying for a very long time, blocks the migration queue, and achieves nothing. It should give up sooner, move on to the next VM, and try again later. If 2 VMs on the same host cannot converge and block its migration queue, nothing moves for 125 minutes; even VMs that could migrate out stay put.
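
A minimal sketch of the arithmetic behind that figure, assuming the overall limit is simply the VM size multiplied by the per-GiB timeout (the function name is hypothetical, not a CNV or KubeVirt API):

```python
def completion_timeout_seconds(vm_size_gib: float, timeout_per_gib_s: int = 150) -> float:
    """Longest time a migration may run before it is aborted.

    Assumption: the limit scales linearly with VM size, as in the description.
    """
    return vm_size_gib * timeout_per_gib_s


timeout_s = completion_timeout_seconds(50)
print(f"{timeout_s:.0f} s = {timeout_s / 60:.0f} min")  # 7500 s = 125 min
```

Under this assumption, every non-converging VM holds its slot in the host's migration queue for the full duration.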
      
      For reference, RHV had this timeout at 64s per GiB (13 years ago). I'd think CNV can be more aggressive, as VMs are larger and networks are faster these days. Fail earlier and move on. But I'd leave the final value to you; I don't think the RHV values are very relevant anymore, things have changed.
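
To put the two values side by side, here is a rough comparison of the resulting limits. The 16 GiB and 256 GiB sizes are hypothetical examples added for illustration; only the 50 GiB case and the 150 s and 64 s values come from this report.

```python
# Per-GiB timeouts mentioned in this report, in seconds.
TIMEOUTS_S_PER_GIB = {"CNV default": 150, "RHV (historical)": 64}

# 50 GiB is the example from the description; the other sizes are hypothetical.
VM_SIZES_GIB = [16, 50, 256]


def timeout_minutes(vm_size_gib: float, timeout_per_gib_s: int) -> float:
    """Overall migration limit in minutes, assuming linear scaling with VM size."""
    return vm_size_gib * timeout_per_gib_s / 60


for label, per_gib in TIMEOUTS_S_PER_GIB.items():
    row = ", ".join(
        f"{size} GiB: {timeout_minutes(size, per_gib):.0f} min" for size in VM_SIZES_GIB
    )
    print(f"{label} ({per_gib} s/GiB): {row}")
```

For the 50 GiB VM, the RHV-era value would free the queue after roughly 53 minutes instead of 125.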
      
      Please note that reducing this should not eliminate the need for CNV-72387, which is about trying to find an optimal time to trigger post-copy based on the migration progress. Just lowering this to trigger post-copy earlier will not fix that problem, as the post-copy start should be based on the current migration context (speeds), not VM size.
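
To illustrate why migration speeds matter more than VM size here, the sketch below uses a deliberately simplified pre-copy model: memory is drained at the transfer bandwidth minus the guest's dirty rate. This is an assumption for illustration only, not how CNV or KubeVirt decides anything, and all names and numbers are hypothetical.

```python
from typing import Optional


def can_converge(bandwidth_mib_s: float, dirty_rate_mib_s: float) -> bool:
    """Pre-copy can only finish if memory is sent faster than the guest dirties it."""
    return bandwidth_mib_s > dirty_rate_mib_s


def estimated_precopy_seconds(
    remaining_mib: float, bandwidth_mib_s: float, dirty_rate_mib_s: float
) -> Optional[float]:
    """Rough time to drain the remaining memory; None if it will never converge."""
    if not can_converge(bandwidth_mib_s, dirty_rate_mib_s):
        return None
    return remaining_mib / (bandwidth_mib_s - dirty_rate_mib_s)


vm_mib = 50 * 1024  # two hypothetical VMs of identical size

# Mostly idle guest: converges quickly, no post-copy needed.
print(estimated_precopy_seconds(vm_mib, bandwidth_mib_s=1000, dirty_rate_mib_s=200))  # 64.0

# Busy guest dirtying memory faster than the link: waiting longer never helps.
print(estimated_precopy_seconds(vm_mib, bandwidth_mib_s=1000, dirty_rate_mib_s=1200))  # None
```

Both VMs have the same size and therefore the same size-based timeout, yet only the observed speeds reveal that the second one will not converge.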
      
      Please look into calibrating these values to match current hardware speeds.
      

              sgott@redhat.com Stuart Gott
              rhn-support-gveitmic Germano Veit Michel
              Denys Shchedrivyi Denys Shchedrivyi