Bug
Resolution: Unresolved
Major
rhos-18.0.3
None
The compute node has the following Intel E810 NICs configured for SR-IOV:
[root@compute-1 ~]# lshw -c network -businfo |grep ens1
pci@0000:3b:00.0  ens1f0    network  Ethernet Controller E810-C for QSFP
pci@0000:3b:00.1  ens1f1    network  Ethernet Controller E810-C for QSFP
pci@0000:3b:01.0  ens1f0v0  network  Ethernet Adaptive Virtual Function
pci@0000:3b:01.1  ens1f0v1  network  Ethernet Adaptive Virtual Function
pci@0000:3b:01.2  ens1f0v2  network  Ethernet Adaptive Virtual Function
pci@0000:3b:01.3  ens1f0v3  network  Ethernet Adaptive Virtual Function
pci@0000:3b:01.4  ens1f0v4  network  Ethernet Adaptive Virtual Function
pci@0000:3b:11.0  ens1f1v0  network  Ethernet Adaptive Virtual Function
pci@0000:3b:11.1  ens1f1v1  network  Ethernet Adaptive Virtual Function
pci@0000:3b:11.2  ens1f1v2  network  Ethernet Adaptive Virtual Function
pci@0000:3b:11.3  ens1f1v3  network  Ethernet Adaptive Virtual Function
pci@0000:3b:11.4  ens1f1v4  network  Ethernet Adaptive Virtual Function
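To make the PF-to-VF layout easier to check, the listing above can be grouped per physical function. The following is a minimal Python sketch, not part of the original report. It assumes VF interfaces follow the <pf-name>v<index> naming seen in the listing; `group_vfs_by_pf` is a hypothetical helper name, and the embedded sample contains only a few of the rows.

```python
import re
from collections import defaultdict

# A few rows copied from the lshw listing above (trimmed for brevity).
LSHW_BUSINFO = """\
pci@0000:3b:00.0  ens1f0    network  Ethernet Controller E810-C for QSFP
pci@0000:3b:00.1  ens1f1    network  Ethernet Controller E810-C for QSFP
pci@0000:3b:01.0  ens1f0v0  network  Ethernet Adaptive Virtual Function
pci@0000:3b:01.1  ens1f0v1  network  Ethernet Adaptive Virtual Function
pci@0000:3b:11.0  ens1f1v0  network  Ethernet Adaptive Virtual Function
"""

# Assumption: a VF interface is named <pf-name>v<index>, as in the listing.
VF_NAME = re.compile(r"^(?P<pf>.+?)v(?P<index>\d+)$")


def group_vfs_by_pf(text):
    """Map each PF interface name to the PCI addresses of its VFs."""
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 3 or not fields[0].startswith("pci@"):
            continue  # skip anything that is not a businfo device row
        address = fields[0][len("pci@"):]
        name = fields[1]
        match = VF_NAME.match(name)
        if match:
            groups[match.group("pf")].append(address)
        else:
            groups.setdefault(name, [])  # a PF, possibly without VFs
    return dict(groups)


if __name__ == "__main__":
    for pf, vfs in group_vfs_by_pf(LSHW_BUSINFO).items():
        print(pf, len(vfs), vfs)
```

Run against the full output, this should report five VFs under each of ens1f0 and ens1f1 if the naming assumption holds.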
And the following PCI configuration for Nova:
[root@compute-1 ~]# cat /var/lib/openstack/config/nova/03-sriov-nova.conf
[pci]
device_spec = {"address": "0000:3b:00.0", "physical_network":"sriov1", "trusted":"true"}
device_spec = {"address": "0000:3b:00.1", "physical_network":"sriov2", "trusted":"true"}
device_spec = {"address": "0000:5f:00.0", "physical_network":"sriovmlx1", "trusted":"true"}
device_spec = {"address": "0000:5f:00.1", "physical_network":"sriovmlx2", "trusted":"true"}
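For cross-checking which PF address maps to which physical network, the device_spec entries above can be read programmatically. This is a minimal Python sketch under stated assumptions, not the way Nova itself loads the file: it treats every `device_spec = ...` line as one JSON object, and the helper names `parse_device_specs` and `physnet_by_address` are hypothetical.

```python
import json

# Contents of 03-sriov-nova.conf as shown above.
NOVA_PCI_CONF = """\
[pci]
device_spec = {"address": "0000:3b:00.0", "physical_network":"sriov1", "trusted":"true"}
device_spec = {"address": "0000:3b:00.1", "physical_network":"sriov2", "trusted":"true"}
device_spec = {"address": "0000:5f:00.0", "physical_network":"sriovmlx1", "trusted":"true"}
device_spec = {"address": "0000:5f:00.1", "physical_network":"sriovmlx2", "trusted":"true"}
"""


def parse_device_specs(text):
    """Return the JSON object from every 'device_spec = ...' line."""
    specs = []
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "device_spec":
            specs.append(json.loads(value))
    return specs


def physnet_by_address(specs):
    """Map each PCI address to its physical_network name."""
    return {spec["address"]: spec["physical_network"] for spec in specs}


if __name__ == "__main__":
    mapping = physnet_by_address(parse_device_specs(NOVA_PCI_CONF))
    for address, physnet in mapping.items():
        print(address, "->", physnet)
```

The lines are split by hand because the `device_spec` key is repeated; Python's `configparser` would reject the duplicates in its default strict mode or keep only the last one otherwise. In this mapping, 0000:3b:00.0 (ens1f0 in the lshw listing) corresponds to sriov1, the physical network used by the port in this report.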
We create the corresponding virtual network and port like this:
sh-5.1$ openstack network show sriov1
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
| admin_state_up            | UP                                   |
| availability_zone_hints   |                                      |
| availability_zones        |                                      |
| created_at                | 2025-01-10T11:34:11Z                 |
| description               |                                      |
| dns_domain                |                                      |
| id                        | e7cec3da-1df1-4e7a-b86c-4eefc2ee2516 |
| ipv4_address_scope        | None                                 |
| ipv6_address_scope        | None                                 |
| is_default                | None                                 |
| is_vlan_transparent       | None                                 |
| l2_adjacency              | True                                 |
| mtu                       | 9000                                 |
| name                      | sriov1                               |
| port_security_enabled     | False                                |
| project_id                | 76c9c771920442af9c0fbba91e73f1b7     |
| provider:network_type     | vlan                                 |
| provider:physical_network | sriov1                               |
| provider:segmentation_id  | 177                                  |
| qos_policy_id             | None                                 |
| revision_number           | 2                                    |
| router:external           | Internal                             |
| segments                  | None                                 |
| shared                    | False                                |
| status                    | ACTIVE                               |
| subnets                   | 539c954f-d703-4c4a-b3a7-843671e2d95b |
| tags                      |                                      |
| tenant_id                 | 76c9c771920442af9c0fbba91e73f1b7     |
| updated_at                | 2025-01-10T11:34:15Z                 |
+---------------------------+--------------------------------------+
sh-5.1$ openstack port show sriov1-port1
+-------------------------+------------------------------------------------------------------------------------------------------------+
| Field                   | Value                                                                                                      |
+-------------------------+------------------------------------------------------------------------------------------------------------+
| admin_state_up          | UP                                                                                                         |
| allowed_address_pairs   |                                                                                                            |
| binding_host_id         |                                                                                                            |
| binding_profile         | capabilities='['rx', 'tx', 'sg', 'tso', 'gso', 'gro', 'rxvlan', 'txvlan', 'ntuple', 'rxhash', 'txudptnl']' |
| binding_vif_details     |                                                                                                            |
| binding_vif_type        | unbound                                                                                                    |
| binding_vnic_type       | direct                                                                                                     |
| created_at              | 2025-01-10T11:34:19Z                                                                                       |
| data_plane_status       | None                                                                                                       |
| description             |                                                                                                            |
| device_id               |                                                                                                            |
| device_owner            |                                                                                                            |
| device_profile          | None                                                                                                       |
| dns_assignment          | fqdn='host-10-10-40-130.openstackgate.local.', hostname='host-10-10-40-130', ip_address='10.10.40.130'     |
| dns_domain              |                                                                                                            |
| dns_name                |                                                                                                            |
| extra_dhcp_opts         |                                                                                                            |
| fixed_ips               | ip_address='10.10.40.130', subnet_id='539c954f-d703-4c4a-b3a7-843671e2d95b'                                |
| id                      | 259aeb66-2510-40d5-a369-483e460c462d                                                                       |
| ip_allocation           | immediate                                                                                                  |
| mac_address             | fa:16:3e:31:63:dd                                                                                          |
| name                    | sriov1-port1                                                                                               |
| network_id              | e7cec3da-1df1-4e7a-b86c-4eefc2ee2516                                                                       |
| numa_affinity_policy    | None                                                                                                       |
| port_security_enabled   | False                                                                                                      |
| project_id              | 76c9c771920442af9c0fbba91e73f1b7                                                                           |
| propagate_uplink_status | None                                                                                                       |
| qos_network_policy_id   | None                                                                                                       |
| qos_policy_id           | None                                                                                                       |
| resource_request        | None                                                                                                       |
| revision_number         | 21                                                                                                         |
| security_group_ids      |                                                                                                            |
| status                  | DOWN                                                                                                       |
| tags                    |                                                                                                            |
| trunk_details           | None                                                                                                       |
| updated_at              | 2025-01-10T18:07:15Z                                                                                       |
+-------------------------+------------------------------------------------------------------------------------------------------------+
When we try to instantiate a VM like this:
sh-5.1$ openstack server create --flavor nfv_qe_base_flavor --nic port-id=$(openstack port show -cid -fvalue sriov1-port1) --image rhel-guest-image-8.4-1245-nfv3.x86_64.img --config-drive True instance3 --wait
+-------------------------------------+----------------------------------------------------------------------------------+
| Field                               | Value                                                                            |
+-------------------------------------+----------------------------------------------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                                                           |
| OS-EXT-AZ:availability_zone         | nova                                                                             |
| OS-EXT-SRV-ATTR:host                | compute-1.ctlplane.example.com                                                   |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute-1.ctlplane.example.com                                                   |
| OS-EXT-SRV-ATTR:instance_name       | instance-00000076                                                                |
| OS-EXT-STS:power_state              | Running                                                                          |
| OS-EXT-STS:task_state               | None                                                                             |
| OS-EXT-STS:vm_state                 | active                                                                           |
| OS-SRV-USG:launched_at              | 2025-01-10T18:21:35.000000                                                       |
| OS-SRV-USG:terminated_at            | None                                                                             |
| accessIPv4                          |                                                                                  |
| accessIPv6                          |                                                                                  |
| addresses                           | sriov1=10.10.40.130                                                              |
| adminPass                           | PncGxCq36siR                                                                     |
| config_drive                        | True                                                                             |
| created                             | 2025-01-10T18:21:19Z                                                             |
| flavor                              | nfv_qe_base_flavor (100)                                                         |
| hostId                              | 6a559c3d46abfad54742e1947cb6c04ca1ea325c5f48ff5aa841439f                         |
| id                                  | f9f0740f-e10d-4309-918c-2d4e7d1c3528                                             |
| image                               | rhel-guest-image-8.4-1245-nfv3.x86_64.img (e90b700c-f998-4ede-a6f0-cb7da357d34c) |
| key_name                            | None                                                                             |
| name                                | instance3                                                                        |
| progress                            | 0                                                                                |
| project_id                          | 76c9c771920442af9c0fbba91e73f1b7                                                 |
| properties                          |                                                                                  |
| security_groups                     | name='default'                                                                   |
| status                              | ACTIVE                                                                           |
| updated                             | 2025-01-10T18:21:35Z                                                             |
| user_id                             | b9389acf93484098a46077e0cb7070f5                                                 |
| volumes_attached                    |                                                                                  |
+-------------------------------------+----------------------------------------------------------------------------------+
The VM reaches ACTIVE status; however, the compute node becomes unresponsive and needs a cold reboot to recover. Attached is an image (a screenshot of the virtual console) showing the messages we see just before it hangs.