Red Hat OpenStack Services on OpenShift / OSPRH-3157

Migration gets stuck at pre-migrating status if source compute node is down but maintenance enabled

    • Moderate

      Description of problem:

      Currently, nova rejects a migration (resize) if the source compute node is down.

      ~~~
      (undercloud) [stack@undercloud-0 ~]$ openstack baremetal node list
      +--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
      | UUID                                 | Name         | Instance UUID                        | Power State | Provisioning State | Maintenance |
      +--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
      | 76fb384e-761d-4811-b03d-2ab981b8daa6 | compute-0    | 1c039eaf-b2bf-4e52-843e-0a0221006dbf | power off   | active             | False       |
      | 281613b5-c2ca-4741-bc6f-e7501b2fd6d8 | compute-1    | 58730849-de47-4263-88f3-7e61b705c55b | power on    | active             | False       |
      | fc5681e1-6ae6-46fe-9bd6-c0b8cb43021e | controller-0 | b698b399-6ecd-4536-a7ed-e66554eedc9d | power on    | active             | False       |
      | e8efc933-3ea2-472b-8321-c8de52308e8d | controller-1 | d834fd79-3865-44a0-8a48-d5a7a5e1c720 | power on    | active             | False       |
      | 7bdcaf64-0993-4878-95d5-1addb21925b2 | controller-2 | ea830026-efe6-4a4c-a702-4dcd4d2d7892 | power on    | active             | False       |
      +--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
      (overcloud) [stack@undercloud-0 ~]$ openstack server list --long
      +--------------------------------------+--------------+---------+------------+-------------+------------------------+----------------------------------+--------------------------------------+-------------+-----------+-------------------+------------------------+------------+
      | ID                                   | Name         | Status  | Task State | Power State | Networks               | Image Name                       | Image ID                             | Flavor Name | Flavor ID | Availability Zone | Host                   | Properties |
      +--------------------------------------+--------------+---------+------------+-------------+------------------------+----------------------------------+--------------------------------------+-------------+-----------+-------------------+------------------------+------------+
      ...
      | d11a09f4-cb44-48ad-9375-8e1be7d77bb2 | testinstance | SHUTOFF | None       | Shutdown    | private=192.168.10.131 | cirros-0.4.0-x86_64-disk.img_alt | 3fb188fd-9902-4c41-a12e-5306edd65922 |             |           | nova              | compute-0.redhat.local |            |
      +--------------------------------------+--------------+---------+------------+-------------+------------------------+----------------------------------+--------------------------------------+-------------+-----------+-------------------+------------------------+------------+
      (overcloud) [stack@undercloud-0 ~]$ openstack compute service list
      +--------------------------------------+--------------+------------------------+------+---------+-------+----------------------------+
      | ID                                   | Binary       | Host                   | Zone | Status  | State | Updated At                 |
      +--------------------------------------+--------------+------------------------+------+---------+-------+----------------------------+
      ...
      | 956ece2d-be24-4618-8e8f-a82f616fc31b | nova-compute | compute-0.redhat.local | nova | enabled | down  | 2021-07-25T07:16:11.000000 |
      +--------------------------------------+--------------+------------------------+------+---------+-------+----------------------------+
      (overcloud) [stack@undercloud-0 ~]$ openstack server migrate testinstance
      Service is unavailable at this time. (HTTP 409) (Request-ID: req-d76e2ba5-57eb-4ae9-8ce3-58dd76b64896)
      ~~~

      However, this validation is bypassed if the nova-compute service on the source compute node is disabled (maintenance enabled).
      The migration is then accepted and gets stuck in pre-migrating status.

      ~~~
      (overcloud) [stack@undercloud-0 ~]$ openstack compute service set --disable compute-0.redhat.local nova-compute
      (overcloud) [stack@undercloud-0 ~]$ openstack compute service list
      +--------------------------------------+--------------+------------------------+------+----------+-------+----------------------------+
      | ID                                   | Binary       | Host                   | Zone | Status   | State | Updated At                 |
      +--------------------------------------+--------------+------------------------+------+----------+-------+----------------------------+
      ...
      | 956ece2d-be24-4618-8e8f-a82f616fc31b | nova-compute | compute-0.redhat.local | nova | disabled | down  | 2021-07-25T07:17:05.000000 |
      +--------------------------------------+--------------+------------------------+------+----------+-------+----------------------------+
      (overcloud) [stack@undercloud-0 ~]$ openstack server migrate testinstance
      (overcloud) [stack@undercloud-0 ~]$
      (overcloud) [stack@undercloud-0 ~]$ nova migration-list --instance-uuid d11a09f4-cb44-48ad-9375-8e1be7d77bb2
      +----+--------------------------------------+------------------------+------------------------+------------------------+------------------------+-------------+---------------+--------------------------------------+------------+------------+----------------------------+----------------------------+-----------+
      | Id | UUID                                 | Source Node            | Dest Node              | Source Compute         | Dest Compute           | Dest Host   | Status        | Instance UUID                        | Old Flavor | New Flavor | Created At                 | Updated At                 | Type      |
      +----+--------------------------------------+------------------------+------------------------+------------------------+------------------------+-------------+---------------+--------------------------------------+------------+------------+----------------------------+----------------------------+-----------+
      | 15 | a7678b2e-4a4f-4fcc-a438-e057cc8796d6 | compute-0.redhat.local | compute-1.redhat.local | compute-0.redhat.local | compute-1.redhat.local | 172.17.1.45 | pre-migrating | d11a09f4-cb44-48ad-9375-8e1be7d77bb2 | 5          | 5          | 2021-07-25T07:17:23.000000 | 2021-07-25T07:17:27.000000 | migration |
      +----+--------------------------------------+------------------------+------------------------+------------------------+------------------------+-------------+---------------+--------------------------------------+------------+------------+----------------------------+----------------------------+-----------+
      (overcloud) [stack@undercloud-0 ~]$ openstack server show testinstance
      --------------------------------------------------------------------------------------------------------------------------------------------------------+

      Field Value

      --------------------------------------------------------------------------------------------------------------------------------------------------------+

      OS-DCF:diskConfig MANUAL
      OS-EXT-AZ:availability_zone nova
      OS-EXT-SRV-ATTR:host compute-0.redhat.local
      OS-EXT-SRV-ATTR:hostname testinstance
      OS-EXT-SRV-ATTR:hypervisor_hostname compute-0.redhat.local
      OS-EXT-SRV-ATTR:instance_name instance-00000024
      OS-EXT-SRV-ATTR:kernel_id  
      OS-EXT-SRV-ATTR:launch_index 0
      OS-EXT-SRV-ATTR:ramdisk_id  
      OS-EXT-SRV-ATTR:reservation_id r-jth003rm
      OS-EXT-SRV-ATTR:root_device_name /dev/vda
      OS-EXT-SRV-ATTR:user_data None
      OS-EXT-STS:power_state Shutdown
      OS-EXT-STS:task_state resize_prep
      OS-EXT-STS:vm_state stopped
      OS-SRV-USG:launched_at 2021-07-14T07:31:30.000000
      OS-SRV-USG:terminated_at None
      accessIPv4  
      accessIPv6  
      addresses private=192.168.10.131
      config_drive  
      created 2021-07-14T07:31:22Z
      description None
      flavor disk='1', ephemeral='0', extra_specs.hw_rng:allowed='True', original_name='m1.nano', ram='128', swap='0', vcpus='1'
      hostId ede2468f5fe92029a8a42760bfe4a90f20f5c064e12ab911a8c80b22
      host_status MAINTENANCE
      id d11a09f4-cb44-48ad-9375-8e1be7d77bb2
      image cirros-0.4.0-x86_64-disk.img_alt (3fb188fd-9902-4c41-a12e-5306edd65922)
      key_name None
      locked False
      locked_reason None
      name testinstance
      progress 0
      project_id 942783ae248c4e9eb353a6e6b327bda5
      properties  
      security_groups name='default'
      server_groups []
      status RESIZE
      tags []
      trusted_image_certificates None
      updated 2021-07-25T07:17:27Z
      user_id 06b547a0af8f49fd8239c85ce5d9571b
      volumes_attached  

      --------------------------------------------------------------------------------------------------------------------------------------------------------+
      ~~~
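
A stuck migration like the one in the output above can be spotted by comparing its last update time with the current time. The following is a minimal sketch, not part of nova or any OpenStack client: the helper name `find_stuck_migrations`, the record layout, the 5-minute threshold, and the `now` value are all assumptions made for illustration. Only the UUID, status, and timestamp come from the migration-list output.

```python
from datetime import datetime, timedelta

# Hypothetical helper (not part of nova): flags migration records that have
# stayed in "pre-migrating" longer than max_age. The threshold is an assumption.
def find_stuck_migrations(migrations, now, max_age=timedelta(minutes=5)):
    stuck = []
    for m in migrations:
        updated = datetime.strptime(m["updated_at"], "%Y-%m-%dT%H:%M:%S.%f")
        if m["status"] == "pre-migrating" and now - updated > max_age:
            stuck.append(m["uuid"])
    return stuck

# Record copied from the migration-list output in this report.
migrations = [{
    "uuid": "a7678b2e-4a4f-4fcc-a438-e057cc8796d6",
    "status": "pre-migrating",
    "updated_at": "2021-07-25T07:17:27.000000",
}]

# "now" is an illustrative timestamp a few minutes after the last update.
stuck = find_stuck_migrations(migrations, now=datetime(2021, 7, 25, 7, 25, 0))
print(stuck)
```

In practice the records would have to be fetched from the migrations listing first; that step is left out because its output format depends on the client in use.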

      Version-Release number of selected component (if applicable):

      How reproducible:
      Always

      Steps to Reproduce:
      1. Create an instance
      2. Shut down the compute node that hosts the instance
      3. Disable the nova-compute service on the source compute node (maintenance enabled)
      4. Migrate the instance
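
Steps 3 and 4 map onto the commands shown in the description. As a rough sketch, they can be collected in a small helper so the reproduction is repeatable. The function name `reproduction_commands` is hypothetical; the commands themselves are the ones used in this report. Steps 1 and 2 are assumed to have been done by hand, since the report gives no commands for them.

```python
import shlex

def reproduction_commands(host, server, instance_uuid):
    """Hypothetical helper: argv lists for steps 3 and 4, plus the status check."""
    return [
        # Step 3: disable nova-compute on the (already powered-off) source host.
        ["openstack", "compute", "service", "set", "--disable", host, "nova-compute"],
        # Step 4: request the cold migration.
        ["openstack", "server", "migrate", server],
        # Check: list the migrations of the instance to see the resulting status.
        ["nova", "migration-list", "--instance-uuid", instance_uuid],
    ]

commands = reproduction_commands(
    "compute-0.redhat.local",
    "testinstance",
    "d11a09f4-cb44-48ad-9375-8e1be7d77bb2",
)
for argv in commands:
    print(shlex.join(argv))
```

Each argv list could be passed to `subprocess.run` on a host where the overcloud credentials are loaded; the sketch only prints the commands so that it has no side effects.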

      Actual results:
      Migration is accepted but gets stuck in pre-migrating status

      Expected results:
      Migration is rejected
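
The difference between the observed and the expected behaviour can be written down as two small predicates over the State and Status columns of the compute service list. This is only a model of what the report describes, not nova's actual validation code. Both function names are hypothetical, and the assumption that a disabled but running source host stays a valid migration source is not stated in the report.

```python
def observed_accepts(state, status):
    """Model of the reported behaviour: a disabled service bypasses the check."""
    return state == "up" or status == "disabled"

def expected_accepts(state, status):
    """Model of the expected behaviour: only the service state matters.

    The status argument is deliberately ignored (assumption: disabling the
    service should not change whether a down source host is rejected).
    """
    return state == "up"

for state, status in [("down", "enabled"), ("down", "disabled")]:
    print(state, status, observed_accepts(state, status), expected_accepts(state, status))
```

The two models disagree only for the down and disabled combination, which is the case this issue is about.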

      Additional info:


            Errata Tool added a comment -

            Since the problem described in this issue should be resolved in a recent advisory, it has been closed.

            For information on the advisory (Release of components for RHOSO 18.0), and where to find the updated files, follow the link below.

            If the solution does not work for you, open a new bug report.
            https://access.redhat.com/errata/RHEA-2024:5245


            melanie witt added a comment -

            This was fixed upstream but has not been included in a previous release yet, so it needs to be tested.

            Takashi Kajinami (Inactive) added a comment -

            The migration moves to error status after the source compute node is started.

            ~~~
            (overcloud) [stack@undercloud-0 ~]$ nova migration-list --instance-uuid d11a09f4-cb44-48ad-9375-8e1be7d77bb2
            +----+--------------------------------------+------------------------+------------------------+------------------------+------------------------+-------------+--------+--------------------------------------+------------+------------+----------------------------+----------------------------+-----------+
            | Id | UUID                                 | Source Node            | Dest Node              | Source Compute         | Dest Compute           | Dest Host   | Status | Instance UUID                        | Old Flavor | New Flavor | Created At                 | Updated At                 | Type      |
            +----+--------------------------------------+------------------------+------------------------+------------------------+------------------------+-------------+--------+--------------------------------------+------------+------------+----------------------------+----------------------------+-----------+
            | 15 | a7678b2e-4a4f-4fcc-a438-e057cc8796d6 | compute-0.redhat.local | compute-1.redhat.local | compute-0.redhat.local | compute-1.redhat.local | 172.17.1.45 | error  | d11a09f4-cb44-48ad-9375-8e1be7d77bb2 | 5          | 5          | 2021-07-25T07:17:23.000000 | 2021-07-25T07:25:39.000000 | migration |
            +----+--------------------------------------+------------------------+------------------------+------------------------+------------------------+-------------+--------+--------------------------------------+------------+------------+----------------------------+----------------------------+-----------+
            ~~~

            ~~~
            (overcloud) [stack@undercloud-0 ~]$ nova instance-action testinstance req-7b0486c6-1970-469c-b99e-f6b9c70ed1da
            Property       Value
            action         migrate
            events         [{'event': 'compute_resize_instance',
                             'finish_time': '2021-07-25T07:25:39.000000',
                             'host': 'compute-0.redhat.local',
                             'hostId': 'ede2468f5fe92029a8a42760bfe4a90f20f5c064e12ab911a8c80b22',
                             'result': 'Error',
                             'start_time': '2021-07-25T07:25:37.000000',
                             'traceback':
                               File "/usr/lib/python3.6/site-packages/nova/compute/utils.py", line 1372, in decorated_function
                                 return function(self, context, *args, **kwargs)
                               File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 219, in decorated_function
                                 kwargs['instance'], e, sys.exc_info())
                               File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
                                 self.force_reraise()
                               File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
                                 six.reraise(self.type_, self.value, self.tb)
                               File "/usr/lib/python3.6/site-packages/six.py", line 675, in reraise
                                 raise value
                               File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 207, in decorated_function
                                 return function(self, context, *args, **kwargs)
                               File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 4886, in resize_instance
                                 self._revert_allocation(context, instance, migration)
                               File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
                                 self.force_reraise()
                               File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
                                 six.reraise(self.type_, self.value, self.tb)
                               File "/usr/lib/python3.6/site-packages/six.py", line 675, in reraise
                                 raise value
                               File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 4883, in resize_instance
                                 instance_type, clean_shutdown, request_spec)
                               File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 4905, in _resize_instance
                                 instance.save(expected_task_state=task_states.RESIZE_PREP)
                               File "/usr/lib/python3.6/site-packages/oslo_versionedobjects/base.py", line 210, in wrapper
                                 ctxt, self, fn.__name__, args, kwargs)
                               File "/usr/lib/python3.6/site-packages/nova/conductor/rpcapi.py", line 246, in object_action
                                 objmethod=objmethod, args=args, kwargs=kwargs)
                               File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/client.py", line 181, in call
                                 transport_options=self.transport_options)
                               File "/usr/lib/python3.6/site-packages/oslo_messaging/transport.py", line 129, in _send
                                 transport_options=transport_options)
                               File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 674, in send
                                 transport_options=transport_options)
                               File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 664, in _send
                                 raise result
                            },
                            {'event': 'compute_prep_resize',
                             'finish_time': '2021-07-25T07:17:27.000000',
                             'host': 'compute-1.redhat.local',
                             'hostId': '5549c43406167205ceb5ec64dcbade90d1fc7a9155c6dd0d221e4858',
                             'result': 'Success',
                             'start_time': '2021-07-25T07:17:27.000000',
                             'traceback': None},
                            {'event': 'cold_migrate',
                             'finish_time': '2021-07-25T07:17:27.000000',
                             'host': 'controller-0.redhat.local',
                             'hostId': 'baad4005132a4fa159710e2fa08f75da18843630088dd053b28b292b',
                             'result': 'Success',
                             'start_time': '2021-07-25T07:17:22.000000',
                             'traceback': None},
                            {'event': 'conductor_migrate_server',
                             'finish_time': '2021-07-25T07:17:27.000000',
                             'host': 'controller-0.redhat.local',
                             'hostId': 'baad4005132a4fa159710e2fa08f75da18843630088dd053b28b292b',
                             'result': 'Success',
                             'start_time': '2021-07-25T07:17:22.000000',
                             'traceback': None}]
            instance_uuid  d11a09f4-cb44-48ad-9375-8e1be7d77bb2
            message        Error
            project_id     942783ae248c4e9eb353a6e6b327bda5
            request_id     req-7b0486c6-1970-469c-b99e-f6b9c70ed1da
            start_time     2021-07-25T07:17:20.000000
            updated_at     2021-07-25T07:25:39.000000
            user_id        06b547a0af8f49fd8239c85ce5d9571b
            ~~~
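
To see at a glance which step of the migration failed and on which host, the events of the instance action can be filtered by result. The sketch below uses a hand-copied subset of the event fields from the comment above (traceback, hostId, and finish_time are omitted). The helper name `failed_events` is hypothetical and not part of any OpenStack client.

```python
# Events copied from the instance-action output in this comment
# (only the fields needed for the filter are kept).
events = [
    {"event": "compute_resize_instance", "host": "compute-0.redhat.local",
     "result": "Error", "start_time": "2021-07-25T07:25:37.000000"},
    {"event": "compute_prep_resize", "host": "compute-1.redhat.local",
     "result": "Success", "start_time": "2021-07-25T07:17:27.000000"},
    {"event": "cold_migrate", "host": "controller-0.redhat.local",
     "result": "Success", "start_time": "2021-07-25T07:17:22.000000"},
    {"event": "conductor_migrate_server", "host": "controller-0.redhat.local",
     "result": "Success", "start_time": "2021-07-25T07:17:22.000000"},
]

def failed_events(events):
    """Hypothetical helper: (event, host) pairs whose result is not Success."""
    return [(e["event"], e["host"]) for e in events if e["result"] != "Success"]

failed = failed_events(events)
print(failed)
```

Only the resize step on the source host failed, and it started about eight minutes after the other events, once compute-0 was running again.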

              Unassigned Unassigned
              jira-bugzilla-migration RH Bugzilla Integration
              rhos-dfg-compute