Spike
Resolution: Done
RHOSSTRAT-185 - NVME passthrough for cloud providers
Goal
Describe the existing Nova interfaces that can be used to implement an external service that handles the cleanup (wiping) of NVMe devices after VM deletion.
Assumptions
- NVMe disks are passed through to the VMs by passing through the NVMe controller PCI device from the hypervisor.
- RHOSO 18.0 is deployed and PCI passthrough is configured for the NVMe controller PCI devices.
- The PCI in Placement feature is enabled in RHOSO (in Tech Preview status as of 18.0-FR1).
- The external cleanup tool has access to the hypervisor and to the OpenStack APIs as admin.
- Only VM create and delete need to be handled for now; migrations, resize, evacuation, etc. are out of scope.
Suggested workaround
High-level steps:
- Detect the creation and deletion of VMs using NVMe device(s) via Nova notifications.
- Reserve the NVMe PCI device via the Placement API when a VM allocating the device is created.
- Wipe the NVMe device after the VM is deleted, then unreserve the PCI device via the Placement API.
Note that the doc below uses pure Nova config options for simplicity. These need to be translated to RHOSO 18.0 configuration.
Recap for PCI passthrough configuration
nova compute conf:
[pci]
device_spec = { "vendor_id":"2646", "product_id":"5013", "device_type": "type-PCI"}
alias = { "name": "nvme-type-1", "vendor_id":"2646", "product_id":"5013", "device_type": "type-PCI"}
nova api conf:
[pci]
alias = { "name": "nvme-type-1", "vendor_id":"2646", "product_id":"5013", "device_type": "type-PCI"}
nova flavor:
$ openstack flavor show m1.nvme1 | grep properties
| properties | pci_passthrough:alias='nvme-type-1:1' |
Recap enable PCI in Placement
Follow the documentation in https://docs.openstack.org/nova/latest/admin/pci-passthrough.html#pci-tracking-in-placement
nova compute conf:
[pci]
report_in_placement = True
nova api conf:
[filter_scheduler]
pci_in_placement = True
Detect VM creation and deletion via Nova notifications
Configure Nova to emit notifications to the message bus
nova compute conf:
[oslo_messaging_notifications]
driver = messagingv2
transport_url = <rabbitmq address>

[notifications]
# use "both" if other tools are also using the notifications and cannot use the new versioned format
notification_format = versioned
Listen on the message bus for notifications
Example Python code to connect and listen for notifications (the full demo is linked in the references):
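A minimal sketch using oslo.messaging is shown below. Assumptions: the RabbitMQ address placeholder has to be filled in, and Nova publishes versioned notifications on the "versioned_notifications" topic (the default).

import time

from oslo_config import cfg
import oslo_messaging


class NotificationEndpoint:
    # Only react to the two events the cleanup tool cares about
    filter_rule = oslo_messaging.NotificationFilter(
        event_type=r'instance\.(create|delete)\.end')

    def info(self, ctxt, publisher_id, event_type, payload, metadata):
        # payload carries the versioned InstanceActionPayload shown below
        print(event_type, payload['nova_object.data']['uuid'])


transport = oslo_messaging.get_notification_transport(
    cfg.CONF, url='rabbit://<rabbitmq address>')
listener = oslo_messaging.get_notification_listener(
    transport,
    [oslo_messaging.Target(topic='versioned_notifications')],
    [NotificationEndpoint()],
    executor='threading')
listener.start()
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    listener.stop()
    listener.wait()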
- The instance.create.end notification is sent after the VM is scheduled to a compute host (hypervisor) and Nova has allocated the requested resources for the VM in the Placement API. After this Nova starts the VM on the compute host.
- The instance.delete.end notification is sent after the VM is stopped and deleted from the hypervisor, and Nova has removed the VM's resource allocation from the Placement API.
For example, an instance.delete.end notification will look like this:
{"message_id": "1323c4b9-bca0-453c-b700-c226769552d1", "publisher_id": "nova-compute:aio", "event_type": "instance.delete.end", "priority": "INFO", "payload": {"nova_object.name": "InstanceActionPayload", "nova_object.namespace": "nova", "nova_object.version": "1.8", "nova_object.data": {"fault": null, "request_id": "req-5847f682-e4dc-44ed-8473-71403424d114", "uuid": "a81880d0-e1f3-4195-8785-9078c899f69e", "user_id": "7e9f6361d07d41b8bd0d2a133c1d5d48", "tenant_id": "82cec4de18334e79b39916d53c3fdaab", "reservation_id": "r-9uplxpf3", "display_name": "vm1", "display_description": null, "host_name": "vm1", "host": "aio", "node": "aio", "os_type": null, "architecture": null, "availability_zone": "nova", "flavor": {"nova_object.name": "FlavorPayload", "nova_object.namespace": "nova", "nova_object.version": "1.4", "nova_object.data": {"flavorid": "9e4b2a6d-5239-4269-a86d-febeb6400505", "memory_mb": 2048, "vcpus": 1, "root_gb": 4, "ephemeral_gb": 0, "name": "m1.nvme1", "swap": 0, "rxtx_factor": 1.0, "vcpu_weight": 0, "disabled": false, "is_public": true, "extra_specs": {"pci_passthrough:alias": "nvme-type-1:1"}, "projects": null, "description": null}}, "image_uuid": "505d3021-b162-4ca2-a83c-f86637de2d31", "key_name": null, "kernel_id": "", "ramdisk_id": "", "created_at": "2024-11-13T13:34:05Z", "launched_at": "2024-11-13T13:34:18Z", "terminated_at": "2024-11-13T14:34:45Z", "deleted_at": "2024-11-13T14:34:48Z", "updated_at": "2024-11-13T14:34:46Z", "state": "deleted", "power_state": "pending", "task_state": null, "progress": 0, "ip_addresses": [], "block_devices": [], "metadata": {}, "locked": false, "auto_disk_config": "MANUAL", "action_initiator_user": "7e9f6361d07d41b8bd0d2a133c1d5d48", "action_initiator_project": "82cec4de18334e79b39916d53c3fdaab", "locked_reason": null}}, "timestamp": "2024-11-13 14:34:49.430346"}
The interesting bits are:
- The FlavorPayload with the extra spec "pci_passthrough:alias": "nvme-type-1:1"; from this the external tool can detect whether the VM was using an NVMe device type (PCI alias).
- The field "node": "aio" tells the external tool which compute host the VM was running on.
- The field "uuid": "a81880d0-e1f3-4195-8785-9078c899f69e" tells which VM was deleted.
Reserve the PCI device in Placement
To prevent Nova from re-assigning a PCI device to the next VM before the cleanup can happen, the external tool needs to reserve the PCI resource in Placement.
- Based on the instance.create.end notification the external tool can detect whether the VM uses a flavor with a PCI alias that matches an NVMe PCI device.
- If so, the external tool can look up the allocations of the VM in Placement. The Placement consumer uuid is the VM uuid from the notification. Based on the resource class, the external tool can find which of the resource providers allocated for the VM represent NVMe PCI devices. The name of each such resource provider encodes both the hostname of the Nova compute host the VM is scheduled to and the PCI address of the NVMe device.
- On each of these resource providers the external tool needs to change the reserved value of the Placement resource inventory from 0 to 1 (see the sketch after this list).
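A hedged sketch of this flow against the Placement REST API using a keystoneauth1 admin session. The credentials, the Keystone URL, and the resource class name CUSTOM_PCI_2646_5013 (derived from the documented default CUSTOM_PCI_{vendor_id}_{product_id} naming) are assumptions; Placement microversion 1.26 or later is requested so that reserved may equal total.

from keystoneauth1 import loading, session

NVME_RC = 'CUSTOM_PCI_2646_5013'  # assumed name for vendor 2646 / product 5013
HEADERS = {'OpenStack-API-Version': 'placement 1.26'}
PLACEMENT = {'service_type': 'placement'}


def get_admin_session():
    # Assumed admin credentials; load them from the environment in practice
    loader = loading.get_plugin_loader('password')
    auth = loader.load_from_options(
        auth_url='http://<keystone address>/v3',
        username='admin', password='<admin password>', project_name='admin',
        user_domain_name='Default', project_domain_name='Default')
    return session.Session(auth=auth)


def set_reserved(sess, rp_uuid, reserved):
    # Read-modify-write the inventory; the GET response carries
    # resource_provider_generation, which guards against concurrent updates
    url = '/resource_providers/%s/inventories/%s' % (rp_uuid, NVME_RC)
    inv = sess.get(url, endpoint_filter=PLACEMENT, headers=HEADERS).json()
    inv['reserved'] = reserved
    sess.put(url, json=inv, endpoint_filter=PLACEMENT, headers=HEADERS)


def reserve_nvme_devices_of_vm(sess, vm_uuid):
    # The Placement consumer uuid is the VM uuid from the notification
    allocs = sess.get('/allocations/%s' % vm_uuid, endpoint_filter=PLACEMENT,
                      headers=HEADERS).json()['allocations']
    nvme_rps = [rp for rp, a in allocs.items() if NVME_RC in a['resources']]
    for rp_uuid in nvme_rps:
        set_reserved(sess, rp_uuid, 1)
    return nvme_rps  # remember these until the VM is deleted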
Wipe the device at VM deletion and unreserve the device
The external tool can detect when a VM is deleted via the instance.delete.end notification. It can use its existing information about the devices reserved in Placement for this VM to know which devices on which host need to be cleaned.
After the tool has finished cleaning a device, it needs to go back to the Placement API and change the reserved value in the inventory from 1 to 0 to signal that the device can be assigned to the next VM, as sketched below.
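Continuing the sketch above, unreserving after a successful wipe is the same inventory update in reverse (wipe_nvme_device is a hypothetical hook standing in for the actual cleanup logic on the hypervisor):

def wipe_and_unreserve(sess, rp_uuids):
    for rp_uuid in rp_uuids:
        # hypothetical hook: run the actual wipe on the hypervisor first
        wipe_nvme_device(rp_uuid)
        set_reserved(sess, rp_uuid, 0)  # device can now go to the next VM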
Dependencies
- RHOSO 18.0.5 is needed for the workaround that supports many PCI devices with PCI in Placement: OSPRH-12962
- RHOSO 18.0-FR3 is planned to graduate PCI in Placement from Tech Preview to Supported: OSPRH-13106
- RHOSO 18.0-FR3 is planned to support configuring the Nova notification message bus via the standard OpenStackControlPlane CR interface: OSPRH-230
Aspects not covered
- how to deploy the external service on top of RHOSO 18.0
- the exact RHOSO 18.0 configuration procedure to enable PCI in Placement and notifications (only pure Nova config options are provided above)
References
- Configuring PCI passthrough in Nova: https://docs.openstack.org/nova/latest/admin/pci-passthrough.html
- Nova notification interface: https://docs.openstack.org/nova/latest/admin/notifications.html
- Example python code consuming Nova notifications: https://github.com/gibizer/nova-notification-demo/blob/3e81258032efab02a721ca3f694cbfc8cf70b143/ws_forwarder.py#L45-L64
- PCI in Placement: https://docs.openstack.org/nova/latest/admin/pci-passthrough.html#pci-tracking-in-placement
- Placement API reference: https://docs.openstack.org/api-ref/placement/
Issue links
- is blocked by OSPRH-230 - As a user I want to get notifications from the deployed nova cluster (Backlog)
- is blocked by OSPRH-13106 - Graduate Flavor based PCI in Placement feature to full support (In Progress)
- is blocked by OSPRH-12962 - Backport allocation candidate fixes to 18.0 (Closed)
- relates to OSPRH-14164 - Updating reserved value of an inventory created by Nova is undefined behavior (Backlog)