-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
None
-
Quality / Stability / Reliability
-
0.42
Description of problem:
From the discussion in CNV-71164, it was clarified (thanks jelejosne) that progressTimeout is not supposed to be a migration stall detector. It is a network timeout detector that checks whether any data is being moved between source and destination at all.
In that case, it should not use the RemainingData statistic from libvirt, because that value goes up and down with the dirty rate and the network transfer. It should use dataProcessed instead, which is the amount of data moved over the network.
If dataProcessed does not increase over time, there is a network connectivity issue between source and destination.
Using remaining data for this does not work well, because it rises and falls with network speed and dirty rate. One can compensate for the other and fool the logic into reporting a false disconnect.
func (m *migrationMonitor) processInflightMigration(dom cli.VirDomain, stats *libvirt.DomainJobInfo) *inflightMigrationAborted {
    [...]
    if (m.progressWatermark == 0) || (m.remainingData < m.progressWatermark) {
        m.lastProgressUpdate = now
    }
Version-Release number of selected component (if applicable):
All