Bug
Resolution: Unresolved
Major
rhos-17.1.z
False
False
Committed
rhos-docs
None
Description of problem:
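THT (tripleo-heat-templates) enables both live_migration_permit_post_copy and live_migration_permit_auto_converge by default, but the two settings conflict: when post-copy is permitted and available, auto-converge is never used.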
Version-Release number of selected component (if applicable):
17.1
16.2
How reproducible:
Default behavior
Steps to Reproduce:
1. Deploy RHOSP
2. Check values of both parameters in any compute host:
~~~
$ sudo egrep "^\[libvirt|^live_migration_permit" /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf
[libvirt]
live_migration_permit_post_copy=True
live_migration_permit_auto_converge=True
~~~
Actual results:
Both parameters are set to True by default:
~~~
$ egrep -nA9 "NovaLiveMigrationPermitPostCopy:$|NovaLiveMigrationPermitAutoConverge:$" /usr/share/openstack-tripleo-heat-templates/deployment/nova/nova-compute-container-puppet.yaml
347:  NovaLiveMigrationPermitPostCopy:
348-    description: >
349-      If "True" activates the instance on the destination node before migration is complete,
350-      and to set an upper bound on the memory that needs to be transferred. Post copy
351-      gets enabled per default if the compute roles is not a realtime role or disabled
352-      by this parameter.
353-    default: true
354-    type: boolean
355-    tags:
356-      - role_specific
357:  NovaLiveMigrationPermitAutoConverge:
358-    description: >
359-      Defaults to "True" to slow down the instance CPU until the memory copy process is faster than
360-      the instance's memory writes when the migration performance is slow and might not complete.
361-      Auto converge will only be used if this flag is set to True and post copy is not permitted
362-      or post copy is unavailable due to the version of libvirt and QEMU.
363-    default: true
364-    type: boolean
365-    tags:
366-      - role_specific
~~~
Note, however, that auto_converge is only used if post copy is not permitted or is unavailable. Neither condition holds in our default configuration, so post copy is always selected and the auto_converge setting has no effect.
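To make the precedence concrete, here is a minimal Python sketch of the selection rule as described in the parameter text above. It is an illustration only, not nova's actual code: the function name `effective_convergence_mechanism` and the `post_copy_available` argument (standing in for the libvirt/QEMU version check) are hypothetical.

```python
from typing import Optional


def effective_convergence_mechanism(
    permit_post_copy: bool,
    permit_auto_converge: bool,
    post_copy_available: bool = True,
) -> Optional[str]:
    """Return the mechanism that would be used, or None if neither applies.

    Simplified illustration of the documented precedence; hypothetical helper,
    not part of nova.
    """
    # Post copy takes precedence whenever it is permitted and the
    # libvirt/QEMU versions support it (modelled by post_copy_available).
    if permit_post_copy and post_copy_available:
        return "post_copy"
    # Auto converge is only considered once post copy is ruled out.
    if permit_auto_converge:
        return "auto_converge"
    return None


if __name__ == "__main__":
    # THT defaults (both True): auto converge is never selected.
    print(effective_convergence_mechanism(True, True))
    # With post copy disabled, auto converge becomes effective.
    print(effective_convergence_mechanism(False, True))
```

Under these assumptions, the default pair (True, True) resolves to post copy, which is why setting live_migration_permit_auto_converge=True alone changes nothing.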
Expected results:
Either post_copy or auto_converge should be enabled, not both at the same time. Based on the findings in the related BZ#2312196, I'm inclined to think that post_copy should be disabled by default, as auto_converge performs much better, especially on workloads under heavy memory pressure.
Additional info:
https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.live_migration_permit_auto_converge
https://github.com/openstack/nova/blob/8a24acd9240f2a2705ccd979577e0e2338a238ef/nova/virt/libvirt/driver.py#L1022-L1029