-
Bug
-
Resolution: Done-Errata
-
Major
-
None
-
4.13, 4.12, 4.14, 4.14.0, 4.15.0
Description of problems:
- Impossible to create a SriovNetworkNodePolicy - SriovNetworkNodeState reports inconsistent data from Nova metadata
Version-Release number of selected component (if applicable):
4.14
How reproducible:
Deploy an OCP 4.14 RC1 cluster with a worker that has 3 network interfaces: one for management and 2 for SR-IOV (Nvidia_mlx5_ConnectX-6), with the two SR-IOV interfaces attached to two different OpenStack Neutron networks.
Steps to Reproduce:
1. Deploy the workers with 2 additional interfaces for SR-IOV, each in its own Neutron network; they do not share the same network UUID. This is important for the rest of the bug description. Both are created from the same physical NIC on the hypervisor.
2. Create a first SriovNetworkNodePolicy, which works without any problem. However, the second interface is also assigned to that Neutron network, which it should not be.
3. Try to create a second SriovNetworkNodePolicy for the other device, which is attached to the other Neutron network; it is rejected by the webhook.
Actual results:
Nova metadata (needed to show PCI mapping with neutron UUIDs):

sh-4.4# curl http://169.254.169.254/openstack/2018-08-27/meta_data.json |jq
{
  "uuid": "7114a452-2e00-4ade-856a-42d4dc7c894f",
  "meta": {
    "Name": "1etsl9fpzrhocpnfv-7bhdj-worker",
    "openshiftClusterID": "1etsl9fpzrhocpnfv-7bhdj"
  },
  "hostname": "1etsl9fpzrhocpnfv-7bhdj-worker-0-fnz68",
  "name": "1etsl9fpzrhocpnfv-7bhdj-worker-0-fnz68",
  "launch_index": 0,
  "availability_zone": "worker",
  "random_seed": "yX8XERoU+8TPtDEXnpVkw+yF/R/wyzdA1vmmWgld6DKhNCin+zQTJdi4FyKkT32wYDnaHi6r0j8v4Ja+24Eu1My6u7y8lis+GuJCj4MrtoQngN7xCOJsRvDAakrg0LTJzeLWSrEJwfDAuZ3Q1ZVUCmT4dsGH6Uqg76jtyzRgetZvD1J8ZI2TuStJ+9XzHfqSWygij/zMAj6KL9jne5SIqy21+397kxElBeK+oBdAc6dz6SBn991n8R0j57SF9xj80sT7Jqfxzg7rQoTCqvD/iz6m+AiAkJrcigUFZngKVQPdPmU3UBLltR0yIjsJN2qqtczW7Avn4Jk7THklS9PF9jmQbblV1CwHdN5IW8oFUWI6iSUcQG1uIxEavJEkj/UWZ77uFheDQOCM3Pv3DRZGnpQfTNrrgxD7sJ9HfMFO5ekRG4pbYuW9eYItasHoyAtiM338rIFC7p9ZvkTcUglmSB65Mh0dV7hOTnEX9Bxwcv9v2S2co/LaF1PxCizUMSYM/S2yRiGf2/mcolmN/E tRaDsNFxnUg9SYzdvyS6nUSMPkLXmnqIdN+/tNKYHB596guxADE6RTwFLGexatCls+bocWoSWwWZDNB6SAZFNiqPxezNZBUGpU3kCXmQCdkq3tIlFWUIhmSNccfJew16BTxZTC4vytGAUytggTnnbzaJg=",
  "project_id": "00552bf9217648d7a5714fbd25f92df2",
  "devices": [
    {
      "vlan": 177,
      "vf_trusted": true,
      "type": "nic",
      "mac": "fa:16:3e:7a:53:f4",
      "bus": "pci",
      "address": "0000:04:00.0"
    },
    {
      "vlan": 178,
      "vf_trusted": true,
      "type": "nic",
      "mac": "fa:16:3e:e6:9b:be",
      "bus": "pci",
      "address": "0000:05:00.0"
    }
  ]
}

sh-4.4# curl http://169.254.169.254/openstack/2018-08-27/network_data.json |jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1295  100  1295    0     0   1558      0 --:--:-- --:--:-- --:--:--  1558
{
  "links": [
    {
      "id": "tapa046a03c-49",
      "vif_id": "a046a03c-4996-4212-8f23-3b8f197b03b3",
      "type": "ovs",
      "mtu": 1500,
      "ethernet_mac_address": "fa:16:3e:9c:9f:89"
    },
    {
      "id": "tapdc2a9cd3-f7",
      "vif_id": "dc2a9cd3-f7f5-40df-9d52-328f86e6011b",
      "type": "hw_veb",
      "mtu": 9000,
      "ethernet_mac_address": "fa:16:3e:7a:53:f4"
    },
    {
      "id": "tapce5054e4-c6",
      "vif_id": "ce5054e4-c65a-4c28-843c-155ab8fed825",
      "type": "hw_veb",
      "mtu": 9000,
      "ethernet_mac_address": "fa:16:3e:e6:9b:be"
    }
  ],
  "networks": [
    {
      "id": "network0",
      "type": "ipv4_dhcp",
      "link": "tapa046a03c-49",
      "network_id": "5765e37b-0a13-49d2-a598-537178ce254f"
    },
    {
      "id": "network1",
      "type": "ipv4",
      "link": "tapdc2a9cd3-f7",
      "ip_address": "192.168.177.4",
      "netmask": "255.255.255.0",
      "routes": [
        {
          "network": "0.0.0.0",
          "netmask": "0.0.0.0",
          "gateway": "192.168.177.1"
        }
      ],
      "network_id": "b3ba899a-e06c-49da-93c5-c992048390b2",
      "services": []
    },
    {
      "id": "network2",
      "type": "ipv4",
      "link": "tapce5054e4-c6",
      "ip_address": "192.168.178.107",
      "netmask": "255.255.255.0",
      "routes": [
        {
          "network": "0.0.0.0",
          "netmask": "0.0.0.0",
          "gateway": "192.168.178.1"
        }
      ],
      "network_id": "a81317cb-aa3d-4675-99cf-aa049f964a3c",
      "services": []
    }
  ],
  "services": [
    {
      "type": "dns",
      "address": "10.19.42.41"
    },
    {
      "type": "dns",
      "address": "10.11.5.19"
    },
    {
      "type": "dns",
      "address": "10.2.32.1"
    }
  ]
}

Webhook:

"message": "error when creating \"deleteme1.yaml\": admission webhook \"operator-webhook.sriovnetwork.openshift.io\" denied the request: no supported NIC is selected by the nicSelector in CR provider1",
"reason": "no supported NIC is selected by the nicSelector in CR provider1",
"code": 400

SriovNetworkNodeState: http://pastebin.test.redhat.com/1110009

As you can see, PCI address 0000:05:00.0 is correctly mapped to a81317cb-aa3d-4675-99cf-aa049f964a3c. However, the 0000:04:00.0 device is not visible; we see 0000:06:00.0 instead.
From an OpenStack perspective, the network attachment is correct:

$ openstack network list
+--------------------------------------+-------------+--------------------------------------+
| ID                                   | Name        | Subnets                              |
+--------------------------------------+-------------+--------------------------------------+
| 5765e37b-0a13-49d2-a598-537178ce254f | management  | cd76d903-2aba-4a45-a454-a6a9403f1e6d |
| a81317cb-aa3d-4675-99cf-aa049f964a3c | provider-2  | fd015f15-7165-49f1-808a-98894b647736 |
| b3ba899a-e06c-49da-93c5-c992048390b2 | provider-1  | f0677cb3-df80-4e96-b6e0-4b1ce3c3fa24 |
| df5a3086-da5d-4e30-a911-dee88d44f307 | lb-mgmt-net | 7cd24dad-9821-46f6-95f9-a900d4081dde |
+--------------------------------------+-------------+--------------------------------------+

$ openstack port list --server 7114a452-2e00-4ade-856a-42d4dc7c894f
+--------------------------------------+----------------------------------------------+-------------------+--------------------------------------------------------------------------------+--------+
| ID                                   | Name                                         | MAC Address       | Fixed IP Addresses                                                             | Status |
+--------------------------------------+----------------------------------------------+-------------------+--------------------------------------------------------------------------------+--------+
| a046a03c-4996-4212-8f23-3b8f197b03b3 | 1etsl9fpzrhocpnfv-7bhdj-worker-0-fnz68-0     | fa:16:3e:9c:9f:89 | ip_address='192.168.0.67', subnet_id='cd76d903-2aba-4a45-a454-a6a9403f1e6d'    | ACTIVE |
| ce5054e4-c65a-4c28-843c-155ab8fed825 | 1etsl9fpzrhocpnfv-7bhdj-worker-0-fnz68-sriov | fa:16:3e:e6:9b:be | ip_address='192.168.178.107', subnet_id='fd015f15-7165-49f1-808a-98894b647736' | ACTIVE |
| dc2a9cd3-f7f5-40df-9d52-328f86e6011b | 1etsl9fpzrhocpnfv-7bhdj-worker-0-fnz68-sriov | fa:16:3e:7a:53:f4 | ip_address='192.168.177.4', subnet_id='f0677cb3-df80-4e96-b6e0-4b1ce3c3fa24'   | ACTIVE |
+--------------------------------------+----------------------------------------------+-------------------+--------------------------------------------------------------------------------+--------+
Expected results:
The 0000:04:00.0 PCI device should be visible, and it should be possible to create a new SriovNetworkNodePolicy for this device on the given network.
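
For reference, the PCI-to-network mapping implied by the Nova metadata in this report can be sketched as follows. This is only an illustration, not the operator's actual code: the helper name pci_to_network_id is hypothetical, and the two JSON documents are trimmed copies of the meta_data.json and network_data.json output quoted under "Actual results". The join key is the MAC address, which appears in both documents.

```python
import json

# Trimmed copy of meta_data.json from this report (only the "devices" list).
META_DATA = json.loads("""
{"devices": [
  {"vlan": 177, "type": "nic", "mac": "fa:16:3e:7a:53:f4", "bus": "pci", "address": "0000:04:00.0"},
  {"vlan": 178, "type": "nic", "mac": "fa:16:3e:e6:9b:be", "bus": "pci", "address": "0000:05:00.0"}
]}
""")

# Trimmed copy of network_data.json from this report (links and networks only).
NETWORK_DATA = json.loads("""
{"links": [
  {"id": "tapa046a03c-49", "type": "ovs", "ethernet_mac_address": "fa:16:3e:9c:9f:89"},
  {"id": "tapdc2a9cd3-f7", "type": "hw_veb", "ethernet_mac_address": "fa:16:3e:7a:53:f4"},
  {"id": "tapce5054e4-c6", "type": "hw_veb", "ethernet_mac_address": "fa:16:3e:e6:9b:be"}
 ],
 "networks": [
  {"id": "network0", "link": "tapa046a03c-49", "network_id": "5765e37b-0a13-49d2-a598-537178ce254f"},
  {"id": "network1", "link": "tapdc2a9cd3-f7", "network_id": "b3ba899a-e06c-49da-93c5-c992048390b2"},
  {"id": "network2", "link": "tapce5054e4-c6", "network_id": "a81317cb-aa3d-4675-99cf-aa049f964a3c"}
 ]}
""")


def pci_to_network_id(meta_data, network_data):
    """Hypothetical helper: map PCI address -> Neutron network UUID via the MAC."""
    # link id -> MAC address, taken from network_data.json "links"
    mac_by_link = {
        link["id"]: link["ethernet_mac_address"].lower()
        for link in network_data.get("links", [])
    }
    # MAC address -> network UUID, taken from network_data.json "networks"
    network_by_mac = {
        mac_by_link[net["link"]]: net["network_id"]
        for net in network_data.get("networks", [])
        if net.get("link") in mac_by_link
    }
    # PCI address -> network UUID, joining meta_data.json "devices" on the MAC
    mapping = {}
    for dev in meta_data.get("devices", []):
        if dev.get("type") != "nic" or dev.get("bus") != "pci":
            continue
        network_id = network_by_mac.get(dev["mac"].lower())
        if network_id is not None:
            mapping[dev["address"]] = network_id
    return mapping


MAPPING = pci_to_network_id(META_DATA, NETWORK_DATA)
for address, network_id in sorted(MAPPING.items()):
    print(address, network_id)
```

With the data from this report, 0000:04:00.0 resolves to provider-1 (b3ba899a-e06c-49da-93c5-c992048390b2) and 0000:05:00.0 to provider-2 (a81317cb-aa3d-4675-99cf-aa049f964a3c). This is the metadata view only; it says nothing about which PCI addresses the guest actually enumerates, which is where the reported 0000:06:00.0 discrepancy shows up.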
- blocks
-
OCPBUGS-22279 openstack: PCI IDs not consistent with nova metadata and therefore SriovNetworkNodeState assigning wrong networkID to interface
- Closed
-
OCPBUGS-22709 openstack: PCI IDs not consistent with nova metadata and therefore SriovNetworkNodeState assigning wrong networkID to interface
- Closed
-
OCPBUGS-23077 openstack: PCI IDs not consistent with nova metadata and therefore SriovNetworkNodeState assigning wrong networkID to interface
- Closed
- is cloned by
-
OCPBUGS-22279 openstack: PCI IDs not consistent with nova metadata and therefore SriovNetworkNodeState assigning wrong networkID to interface
- Closed
-
OCPBUGS-22709 openstack: PCI IDs not consistent with nova metadata and therefore SriovNetworkNodeState assigning wrong networkID to interface
- Closed
-
OCPBUGS-23077 openstack: PCI IDs not consistent with nova metadata and therefore SriovNetworkNodeState assigning wrong networkID to interface
- Closed
- links to
-
RHBA-2023:6845 OpenShift Container Platform 4.13.z bug fix update