  1. OpenShift Virtualization
  2. CNV-72386

progressTimeout is looking at remaining data for network monitoring


    • Bug
    • Resolution: Unresolved
    • Major
    • CNV v4.22.0
    • None
    • CNV Virt-Node
    • None
    • Quality / Stability / Reliability
    • 0.42
    • False
    • False
    • None
    • None

      Description of problem:

      The discussion in CNV-71164 clarified (thanks jelejosne) that progressTimeout is not meant to be a migration stall detector. It is a network timeout detector that checks whether any data is moving between source and destination at all.

      Given that, it should not use the RemainingData statistic from libvirt, because that value rises and falls with the dirty rate and the network transfer rate. It should use dataProcessed instead, which is the amount of data moved over the network.

      If dataProcessed does not increase over time, there is a network connectivity issue between source and destination.

      Using the remaining data for this does not work well, because it can go up and down with network speed and dirty rate. One can offset the other and trick the logic into reporting a false disconnect.

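      func (m *migrationMonitor) processInflightMigration(dom cli.VirDomain, stats *libvirt.DomainJobInfo) *inflightMigrationAborted {
              [...]
              // Progress is recorded only when the watermark is unset (0) or
              // remainingData has dropped below the previous watermark.
              if (m.progressWatermark == 0) || (m.remainingData < m.progressWatermark) {
                      m.lastProgressUpdate = now
              }
              [...]
      }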
      
      

      Version-Release number of selected component (if applicable):

      All

              sgott@redhat.com Stuart Gott
              rhn-support-gveitmic Germano Veit Michel
              Denys Shchedrivyi Denys Shchedrivyi
