Issue Type: Bug
Resolution: Unresolved
Priority: Major
Affects Versions: 2025.2 (Flamingo), rhos-18.0.11
Component: rhos-workloads-compute
Release Note Type: Known Issue
Sprints: Sprint 5 Quasar & Pulsar, Sprint 6 Quasar & Pulsar, Sprint 9 Quasar & Pulsar
Severity: Critical
To Reproduce
Steps to reproduce the behavior:
When attempting to create an instance with more than four NVMe disks attached via PCI passthrough, the operation fails with a system fault.
The following flavor is used:
$ openstack flavor show s4a.64x512.NVMEx8
+----------------------------+------------------------------------------------------------------------------+
| Field                      | Value                                                                        |
+----------------------------+------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                                        |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                                            |
| access_project_ids         | None                                                                         |
| description                | None                                                                         |
| disk                       | 0                                                                            |
| id                         | 2e900e83-382a-48d5-a4e2-b20ea4a6cd0c                                         |
| name                       | s4a.64x512.NVMEx8                                                            |
| os-flavor-access:is_public | True                                                                         |
| properties                 | aggregate_instance_extra_specs:type='amd48nvme', hw:cpu_policy='dedicated',  |
|                            | hw:mem_page_size='large', hw:numa_nodes='2', pci_passthrough:alias='nvme:8'  |
| ram                        | 524288                                                                       |
| rxtx_factor                | 1.0                                                                          |
| swap                       | 0                                                                            |
| vcpus                      | 64                                                                           |
+----------------------------+------------------------------------------------------------------------------+
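For clarity, the failure is triggered by an ordinary boot against this flavor; a minimal example (the image and network names below are placeholders, not values from this environment):

$ openstack server create \
    --flavor s4a.64x512.NVMEx8 \
    --image <image> \
    --network <network> \
    --wait \
    nvme8-test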
With the following flavor it works flawlessly, with or without hw:numa_nodes:
$ openstack flavor show s4a.32x256.NVMEx4
+----------------------------+------------------------------------------------------------------------------+
| Field                      | Value                                                                        |
+----------------------------+------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                                        |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                                            |
| access_project_ids         | None                                                                         |
| description                | None                                                                         |
| disk                       | 0                                                                            |
| id                         | 742a1af3-2eed-4a27-b8e6-21a383b57288                                         |
| name                       | s4a.32x256.NVMEx4                                                            |
| os-flavor-access:is_public | True                                                                         |
| properties                 | aggregate_instance_extra_specs:type='amd48nvme', hw:cpu_policy='dedicated',  |
|                            | hw:mem_page_size='large', pci_passthrough:alias='nvme:4'                     |
| ram                        | 262144                                                                       |
| rxtx_factor                | 1.0                                                                          |
| swap                       | 0                                                                            |
| vcpus                      | 32                                                                           |
+----------------------------+------------------------------------------------------------------------------+
Deployed version:
$ oc get openstackversions.core.openstack.org
NAME                    TARGET VERSION       AVAILABLE VERSION    DEPLOYED VERSION
openstackcontrolplane   18.0.11-20250812.2   18.0.11-20250812.2   18.0.11-20250812.2

Device info:
02:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
03:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
04:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
05:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
24:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
25:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
26:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
27:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
64:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
65:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
66:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
67:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
84:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
85:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
c3:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
c4:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
c5:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
c6:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
e3:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
e4:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
e5:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
e6:00.0 Non-Volatile memory controller: SK hynix PE81x0 U.2/3 NVMe Solid State Drive (rev 21)
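For context, the pci_passthrough:alias='nvme:N' flavor properties above assume a matching [pci] section in nova-compute's configuration, along these lines (a sketch only; the vendor and product IDs are placeholders, not confirmed from this environment):

[pci]
# Expose the NVMe controllers to nova for passthrough (IDs illustrative)
device_spec = { "vendor_id": "<vendor_id>", "product_id": "<product_id>" }
# The alias name referenced by the flavors' pci_passthrough:alias property
alias = { "vendor_id": "<vendor_id>", "product_id": "<product_id>", "device_type": "type-PCI", "name": "nvme" }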
I have unsuccessfully attempted to mitigate the issue with several customServiceConfig overrides, applied to the Placement template as sketched below:
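A minimal sketch of how each override was applied (assuming the control plane resource names used elsewhere in this report; the patch payload is illustrative):

$ oc patch openstackcontrolplanes.core.openstack.org openstackcontrolplane --type=merge -p '
spec:
  placement:
    template:
      customServiceConfig: |
        [placement]
        randomize_allocation_candidates = True
        max_allocation_candidates = 100000
        allocation_candidates_generation_strategy = breadth-first
'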
1st attempt:
$ oc get openstackcontrolplanes.core.openstack.org openstackcontrolplane -oyaml | yq -rC '.spec.placement.template.customServiceConfig' | sed 's/\\n/\n/g'
[placement]
randomize_allocation_candidates = True
max_allocation_candidates = 100000
allocation_candidates_generation_strategy = breadth-first

$ oc exec -c placement-api pod/placement-6484d89798-wflxw -- cat /etc/placement/placement.conf.d/custom.conf
[placement]
randomize_allocation_candidates = True
max_allocation_candidates = 100000
allocation_candidates_generation_strategy = breadth-first
2nd attempt:
$ oc get openstackcontrolplanes.core.openstack.org openstackcontrolplane -oyaml | yq -rC '.spec.placement.template.customServiceConfig' | sed 's/\\n/\n/g'
[placement]
max_allocation_candidates = -1
allocation_candidates_generation_strategy = depth-first

$ oc exec -c placement-api pod/placement-7fbb5fcd-h58zf -- cat /etc/placement/placement.conf.d/custom.conf
[placement]
max_allocation_candidates = -1
allocation_candidates_generation_strategy = depth-first
3rd attempt:
$ oc get openstackcontrolplanes.core.openstack.org openstackcontrolplane -oyaml | yq -rC '.spec.placement.template.customServiceConfig' | sed 's/\\n/\n/g'
[placement]
max_allocation_candidates = 50
allocation_candidates_generation_strategy = breadth-first

$ oc exec -c placement-api pod/placement-6fdbdb464-5lcfp -- cat /etc/placement/placement.conf.d/custom.conf
[placement]
max_allocation_candidates = 50
allocation_candidates_generation_strategy = breadth-first
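To see whether candidate generation itself is the limiting factor, placement can also be queried directly for the flavor's resource ask (this requires the osc-placement CLI plugin, and PCI devices only appear as placement resources when nova's [pci] report_in_placement is enabled, so treat this as a sketch rather than a confirmed diagnostic for this deployment):

$ openstack allocation candidate list --resource VCPU=64 --resource MEMORY_MB=524288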
Bug impact
This impacts the platform release and customer onboarding, both scheduled within the next few days.
Relates to: RHOSSTRAT-773 [NVME passthrough] Secure Data Wipe Post-VM Deletion (Closed)