  1. Red Hat OpenStack Services on OpenShift
  2. OSPRH-12945

Compute hangs when using SR-IOV VF and Intel E810


    • Bug
    • Resolution: Unresolved
    • Major
    • rhos-18.0.16
    • rhos-18.0.3
    • os-net-config
    • None
    • False
    • False
    • ?
    • rhos-connectivity-nfv
    • None
    • NFV Automation 010
    • 1
    • Important

      The following Intel E810 NICs are configured for SR-IOV:

      [root@compute-1 ~]# lshw -c network -businfo |grep ens1
      pci@0000:3b:00.0  ens1f0     network        Ethernet Controller E810-C for QSFP
      pci@0000:3b:00.1  ens1f1     network        Ethernet Controller E810-C for QSFP
      pci@0000:3b:01.0  ens1f0v0   network        Ethernet Adaptive Virtual Function
      pci@0000:3b:01.1  ens1f0v1   network        Ethernet Adaptive Virtual Function
      pci@0000:3b:01.2  ens1f0v2   network        Ethernet Adaptive Virtual Function
      pci@0000:3b:01.3  ens1f0v3   network        Ethernet Adaptive Virtual Function
      pci@0000:3b:01.4  ens1f0v4   network        Ethernet Adaptive Virtual Function
      pci@0000:3b:11.0  ens1f1v0   network        Ethernet Adaptive Virtual Function
      pci@0000:3b:11.1  ens1f1v1   network        Ethernet Adaptive Virtual Function
      pci@0000:3b:11.2  ens1f1v2   network        Ethernet Adaptive Virtual Function
      pci@0000:3b:11.3  ens1f1v3   network        Ethernet Adaptive Virtual Function
      pci@0000:3b:11.4  ens1f1v4   network        Ethernet Adaptive Virtual Function
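
To cross-check the VF-to-PF mapping in the listing above, the VFs of a PF can also be read from sysfs. This is a minimal sketch, assuming the standard Linux sysfs layout in which each virtfnN symlink under the PF's device directory points at one VF. The function name list_vfs and the SYSFS_ROOT override are hypothetical names introduced here for illustration; they are not part of any tool mentioned in this report.

```shell
# Bash sketch: print "virtfnN <PCI address>" for every VF of a PF interface.
# SYSFS_ROOT is a hypothetical override so the function can be pointed at a
# test tree; on a real host it defaults to /sys.
list_vfs() {
    local pf="$1"
    local root="${SYSFS_ROOT:-/sys}"
    local link
    # Note: the glob expands in lexical order (virtfn10 sorts before virtfn2).
    for link in "$root/class/net/$pf/device"/virtfn*; do
        # Skip the unexpanded glob when the PF has no VFs.
        [ -L "$link" ] || continue
        # Each virtfnN symlink points at the VF's PCI device directory,
        # whose name is the VF's PCI address.
        printf '%s %s\n' "${link##*/}" "$(basename "$(readlink "$link")")"
    done
}
```

On the host above, running list_vfs ens1f0 should list addresses that correspond to the ens1f0v0 to ens1f0v4 entries reported by lshw.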
      

      And the following PCI configuration for Nova:

      [root@compute-1 ~]# cat /var/lib/openstack/config/nova/03-sriov-nova.conf 
      [pci]
      device_spec = {"address": "0000:3b:00.0", "physical_network":"sriov1", "trusted":"true"}
      device_spec = {"address": "0000:3b:00.1", "physical_network":"sriov2", "trusted":"true"}
      device_spec = {"address": "0000:5f:00.0", "physical_network":"sriovmlx1", "trusted":"true"}
      device_spec = {"address": "0000:5f:00.1", "physical_network":"sriovmlx2", "trusted":"true"}
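
As a sanity check on the configuration above, the PCI addresses in the device_spec lines can be extracted and compared against the devices present on the host. This is a rough sketch, assuming every device_spec line carries a full literal "address" value as in this file (Nova also accepts other address forms, which this does not handle). The function names and the SYSFS_ROOT override are hypothetical.

```shell
# Bash sketch: print the "address" value of every device_spec line.
device_spec_addresses() {
    local conf="$1"
    # sed -n with the p flag prints only lines where the substitution matched.
    sed -n 's/^[[:space:]]*device_spec[[:space:]]*=.*"address"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$conf"
}

# Report, for each configured address, whether a PCI device with that address
# exists under sysfs. SYSFS_ROOT is a hypothetical override for testing; on a
# real host it defaults to /sys.
check_device_specs() {
    local conf="$1" root="${SYSFS_ROOT:-/sys}" addr
    device_spec_addresses "$conf" | while read -r addr; do
        if [ -e "$root/bus/pci/devices/$addr" ]; then
            echo "$addr present"
        else
            echo "$addr missing"
        fi
    done
}
```

For example, check_device_specs /var/lib/openstack/config/nova/03-sriov-nova.conf would print one line per configured PF address.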
      

      We created the corresponding virtual network and port, shown here:

      sh-5.1$ openstack network show sriov1
      +---------------------------+--------------------------------------+
      | Field                     | Value                                |
      +---------------------------+--------------------------------------+
      | admin_state_up            | UP                                   |
      | availability_zone_hints   |                                      |
      | availability_zones        |                                      |
      | created_at                | 2025-01-10T11:34:11Z                 |
      | description               |                                      |
      | dns_domain                |                                      |
      | id                        | e7cec3da-1df1-4e7a-b86c-4eefc2ee2516 |
      | ipv4_address_scope        | None                                 |
      | ipv6_address_scope        | None                                 |
      | is_default                | None                                 |
      | is_vlan_transparent       | None                                 |
      | l2_adjacency              | True                                 |
      | mtu                       | 9000                                 |
      | name                      | sriov1                               |
      | port_security_enabled     | False                                |
      | project_id                | 76c9c771920442af9c0fbba91e73f1b7     |
      | provider:network_type     | vlan                                 |
      | provider:physical_network | sriov1                               |
      | provider:segmentation_id  | 177                                  |
      | qos_policy_id             | None                                 |
      | revision_number           | 2                                    |
      | router:external           | Internal                             |
      | segments                  | None                                 |
      | shared                    | False                                |
      | status                    | ACTIVE                               |
      | subnets                   | 539c954f-d703-4c4a-b3a7-843671e2d95b |
      | tags                      |                                      |
      | tenant_id                 | 76c9c771920442af9c0fbba91e73f1b7     |
      | updated_at                | 2025-01-10T11:34:15Z                 |
      +---------------------------+--------------------------------------+
      
      sh-5.1$ openstack port show sriov1-port1
      +-------------------------+------------------------------------------------------------------------------------------------------------+
      | Field                   | Value                                                                                                      |
      +-------------------------+------------------------------------------------------------------------------------------------------------+
      | admin_state_up          | UP                                                                                                         |
      | allowed_address_pairs   |                                                                                                            |
      | binding_host_id         |                                                                                                            |
      | binding_profile         | capabilities='['rx', 'tx', 'sg', 'tso', 'gso', 'gro', 'rxvlan', 'txvlan', 'ntuple', 'rxhash', 'txudptnl']' |
      | binding_vif_details     |                                                                                                            |
      | binding_vif_type        | unbound                                                                                                    |
      | binding_vnic_type       | direct                                                                                                     |
      | created_at              | 2025-01-10T11:34:19Z                                                                                       |
      | data_plane_status       | None                                                                                                       |
      | description             |                                                                                                            |
      | device_id               |                                                                                                            |
      | device_owner            |                                                                                                            |
      | device_profile          | None                                                                                                       |
      | dns_assignment          | fqdn='host-10-10-40-130.openstackgate.local.', hostname='host-10-10-40-130', ip_address='10.10.40.130'     |
      | dns_domain              |                                                                                                            |
      | dns_name                |                                                                                                            |
      | extra_dhcp_opts         |                                                                                                            |
      | fixed_ips               | ip_address='10.10.40.130', subnet_id='539c954f-d703-4c4a-b3a7-843671e2d95b'                                |
      | id                      | 259aeb66-2510-40d5-a369-483e460c462d                                                                       |
      | ip_allocation           | immediate                                                                                                  |
      | mac_address             | fa:16:3e:31:63:dd                                                                                          |
      | name                    | sriov1-port1                                                                                               |
      | network_id              | e7cec3da-1df1-4e7a-b86c-4eefc2ee2516                                                                       |
      | numa_affinity_policy    | None                                                                                                       |
      | port_security_enabled   | False                                                                                                      |
      | project_id              | 76c9c771920442af9c0fbba91e73f1b7                                                                           |
      | propagate_uplink_status | None                                                                                                       |
      | qos_network_policy_id   | None                                                                                                       |
      | qos_policy_id           | None                                                                                                       |
      | resource_request        | None                                                                                                       |
      | revision_number         | 21                                                                                                         |
      | security_group_ids      |                                                                                                            |
      | status                  | DOWN                                                                                                       |
      | tags                    |                                                                                                            |
      | trunk_details           | None                                                                                                       |
      | updated_at              | 2025-01-10T18:07:15Z                                                                                       |
      +-------------------------+------------------------------------------------------------------------------------------------------------+
      

      We then instantiate a VM like this:

      sh-5.1$ openstack server create --flavor nfv_qe_base_flavor --nic port-id=$(openstack port show -cid -fvalue sriov1-port1) --image rhel-guest-image-8.4-1245-nfv3.x86_64.img --config-drive True instance3 --wait
      
      +-------------------------------------+----------------------------------------------------------------------------------+
      | Field                               | Value                                                                            |
      +-------------------------------------+----------------------------------------------------------------------------------+
      | OS-DCF:diskConfig                   | MANUAL                                                                           |
      | OS-EXT-AZ:availability_zone         | nova                                                                             |
      | OS-EXT-SRV-ATTR:host                | compute-1.ctlplane.example.com                                                   |
      | OS-EXT-SRV-ATTR:hypervisor_hostname | compute-1.ctlplane.example.com                                                   |
      | OS-EXT-SRV-ATTR:instance_name       | instance-00000076                                                                |
      | OS-EXT-STS:power_state              | Running                                                                          |
      | OS-EXT-STS:task_state               | None                                                                             |
      | OS-EXT-STS:vm_state                 | active                                                                           |
      | OS-SRV-USG:launched_at              | 2025-01-10T18:21:35.000000                                                       |
      | OS-SRV-USG:terminated_at            | None                                                                             |
      | accessIPv4                          |                                                                                  |
      | accessIPv6                          |                                                                                  |
      | addresses                           | sriov1=10.10.40.130                                                              |
      | adminPass                           | PncGxCq36siR                                                                     |
      | config_drive                        | True                                                                             |
      | created                             | 2025-01-10T18:21:19Z                                                             |
      | flavor                              | nfv_qe_base_flavor (100)                                                         |
      | hostId                              | 6a559c3d46abfad54742e1947cb6c04ca1ea325c5f48ff5aa841439f                         |
      | id                                  | f9f0740f-e10d-4309-918c-2d4e7d1c3528                                             |
      | image                               | rhel-guest-image-8.4-1245-nfv3.x86_64.img (e90b700c-f998-4ede-a6f0-cb7da357d34c) |
      | key_name                            | None                                                                             |
      | name                                | instance3                                                                        |
      | progress                            | 0                                                                                |
      | project_id                          | 76c9c771920442af9c0fbba91e73f1b7                                                 |
      | properties                          |                                                                                  |
      | security_groups                     | name='default'                                                                   |
      | status                              | ACTIVE                                                                           |
      | updated                             | 2025-01-10T18:21:35Z                                                             |
      | user_id                             | b9389acf93484098a46077e0cb7070f5                                                 |
      | volumes_attached                    |                                                                                  |
      +-------------------------------------+----------------------------------------------------------------------------------+
      

      The VM status is ACTIVE; however, the compute node becomes unresponsive and needs a cold reboot to recover. Attached is an image (a screenshot of the virtual console) with the messages we see just before it hangs.
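
Since the console screenshot captures only the last few messages, it may help to pull the driver-related kernel lines from the previous boot after the cold reboot. This is only a sketch: it assumes the journal is persistent on the compute node, so that journalctl -k -b -1 can still return the previous boot's kernel messages, and that the relevant drivers are ice (E810 PF), iavf (the Adaptive Virtual Function) and vfio. The function name filter_nic_lines is hypothetical.

```shell
# Bash sketch: keep only log lines that mention the ice, iavf or vfio drivers
# as whole words (so "device" does not match "ice"). Reads from stdin so any
# log source can be piped in.
filter_nic_lines() {
    # "|| true" keeps the exit status zero when nothing matches.
    grep -E '(^|[^[:alnum:]_])(ice|iavf|vfio)([^[:alnum:]_]|$)' || true
}

# Assumed usage on the compute node after the reboot:
#   journalctl -k -b -1 --no-pager | filter_nic_lines
```

If the journal is not persistent, the previous boot's messages will not be available and the screenshot remains the only record.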

              jpalanis@redhat.com Jaganathan Palanisamy
              rdiazcam@redhat.com Ricardo Diaz Campos
              rhos-dfg-nfv