Uploaded image for project: 'Red Hat OpenStack Services on OpenShift'
  1. Red Hat OpenStack Services on OpenShift
  2. OSPRH-15801

Deferred binding inhibits ironic configuration drives from having usable network metadata

    • Type: Bug
    • Resolution: Done
    • Priority: Normal
    • Component: openstack-ironic
    • Team: rhos-ops-day1day2-hardprov
    • Sprints: HardProv Sprint 2, HardProv Sprint 3, HardProv Sprint 4, HardProv Sprint 5
    • Severity: Important

      This is a clone of the nova bug below; it describes a similar problem with a slightly different effect.

      Upstream this is tracked as: https://bugs.launchpad.net/ironic/+bug/2106073

      The issue, in short, is that you can request nova to create an "instance" with networks. In certain cases, such as spine/leaf deployments, addressing is deferred. The addressing has to be deferred because neutron has not yet been able to consult the physical networks, and IP addressing depends on that physical network context; i.e. the IP addresses assigned can differ depending on which physical_network is available, and that cannot be determined until the port is matched up with a segment.
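
A deferred binding is visible on the Neutron port itself. Below is an illustrative sketch (not an API call) of what such a port looks like; the field names follow the Neutron port API, where `ip_allocation` is `"deferred"` on routed/segmented networks until the port binds, while the id values and the helper function are made up for this example.

```python
# Illustrative Neutron port in the deferred-addressing state; the uuids
# below are made-up placeholders, and only the field names mirror the
# real Neutron port API.
deferred_port = {
    "id": "11111111-2222-3333-4444-555555555555",          # hypothetical
    "network_id": "66666666-7777-8888-9999-000000000000",  # hypothetical
    "ip_allocation": "deferred",
    "fixed_ips": [],        # empty until binding resolves the segment
    "binding:host_id": "",  # unbound: neutron can't pick a segment yet
}

def addressing_is_deferred(port):
    """True when IP assignment is still waiting on physical-network binding."""
    return port.get("ip_allocation") == "deferred" and not port.get("fixed_ips")
```

Once the port is bound to a host, Neutron can match it to a segment and `fixed_ips` gets populated; until then any metadata generated from this port carries no addressing.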

      What functionally happens is that, as part of the overall process of deploying a baremetal node, the nova.virt.ironic driver takes the metadata from the metadata service, masters an ISO from it, and submits that to Ironic as the configuration drive.

      The sequence of events is roughly as follows:

      1. Claim the physical node, which has already been claimed in placement (by setting /instance_uuid on the ironic node).
      2. Post VIF attachments to the Ironic node.
      3. In theory, attempt the deferred metadata generation, which leads into configuration drive generation.
      4. Post the configuration drive to Ironic as part of the "deploy the node" action.
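
The four steps above can be sketched as follows. The client class is a hypothetical stand-in written to show the ordering and where the empty metadata enters the flow; it is not the real nova.virt.ironic or python-ironicclient API.

```python
# Hypothetical sketch of the deploy sequence; a stand-in, not a real client.
class FakeIronicClient:
    def __init__(self):
        self.node = {"instance_uuid": None, "vifs": [],
                     "configdrive": None, "provision_state": "available"}
        self.calls = []

    def claim(self, instance_uuid):
        # Step 1: claim the node by setting /instance_uuid.
        self.node["instance_uuid"] = instance_uuid
        self.calls.append("claim")

    def attach_vif(self, vif_id):
        # Step 2: post VIF attachments to the node.
        self.node["vifs"].append(vif_id)
        self.calls.append("vif")

    def build_configdrive(self, network_data):
        # Step 3: deferred metadata generation feeds configdrive generation.
        # With deferred addressing, network_data arrives with empty lists.
        self.node["configdrive"] = {"network_data": network_data}
        self.calls.append("configdrive")

    def deploy(self):
        # Step 4: the configdrive rides along with the "deploy the node" action.
        self.node["provision_state"] = "active"
        self.calls.append("deploy")

client = FakeIronicClient()
client.claim("6b1f2f10-0000-0000-0000-000000000000")  # made-up uuid
client.attach_vif("vif-1")
# The bug: deferred binding means the generated metadata is empty.
client.build_configdrive({"links": [], "networks": [], "services": []})
client.deploy()
```

The node reaches "active", which is why the deployment itself appears to succeed even though the configuration drive it carries is useless.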

      This also seems to point toward a slightly different end result in the case of a deferred-addressing bind.

      The network_data.json we get on the configuration drive is functionally empty. The labels are present, unlike the macro error in the virtual machine case, but the lists of values are quite literally empty.

      This results in baremetal nodes that deploy, but that have no network configuration they can consult in the configuration drive.

      The expectation is that it would have content in the network_data.json file, and that the data would be correct/valid.
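
To make the "labels present, lists empty" symptom concrete, the sketch below contrasts the broken output with roughly what a usable file would carry. The shape follows the standard network_data.json format (links / networks / services); the interface name, MAC, and addresses are made-up example values.

```python
import json

# What the bug produces: the top-level keys are present, but every list is empty.
broken = json.loads('{"links": [], "networks": [], "services": []}')

# Roughly what a usable network_data.json would contain instead; values are
# illustrative, only the structure follows the network_data.json format.
usable = {
    "links": [
        {"id": "enp1s0", "type": "phy",
         "ethernet_mac_address": "fa:16:3e:00:00:01"},  # hypothetical values
    ],
    "networks": [
        {"id": "network0", "link": "enp1s0", "type": "ipv4",
         "ip_address": "192.0.2.10", "netmask": "255.255.255.0"},
    ],
    "services": [
        {"type": "dns", "address": "192.0.2.1"},
    ],
}
```

Note that each entry in "networks" references a "links" entry by id; a file with empty lists gives first-boot tooling nothing to resolve.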

      Part of the underlying issue is that port binding is controlled by Ironic. You can't "bind" a physical port in advance to give neutron complete physical network insight. In a sense, this is a distinct bug with Ironic, because Ironic is also explicitly charged with controlling the state and security context of the baremetal node, and an early binding of the baremetal node would create an inherent security flaw.

      If we fix the base port binding logic in Ironic to do the minimum needful (https://review.opendev.org/c/openstack/ironic/+/946378), i.e. provide enough context to do the binding without physically attaching, and nova metadata were actually refreshed post-attachment, then this issue would be completely fixed. That being said, the most likely path is to teach Ironic to discover and correct the overall issue with the configuration drive contents, as most Ironic operators don't use the metadata service at all; they use configuration drives. That most likely path has been started in https://review.opendev.org/c/openstack/ironic/+/946677.

      The early port binding logic fix also needs to navigate issues around metalsmith, which was itself creating an early bind that would then fail in CI.

       

      Expected result:

      Usable metadata in network_data.json, in the proper network_data.json file format.

       

      Impact:

      Operators doing spine/leaf deployments, where network interfaces are created without ports or where a network is used to attach the deployment (as reported upstream), will end up in a state where the metadata in the configuration drive is useless. While DHCP might be available, and that may be enough to get a node online, the node might not have information on any ports which are not DHCP-enabled.

      Workaround:

      Only deploy nodes with pre-created ports with addressing enabled. That is not suitable for all users, as this bug arises from a fundamentally more advanced model of use.

      -------

      This is the nova bug below, which was cloned.

      Description of problem:
      After bug #2081254 was fixed in RHOSP 17.1, it is possible to attach ports without IP addresses to VMs. The feature itself works as expected, but Nova doesn't create a consistent network_data.json for this scenario: ipless ports are not added to the 'links' list.

      The reason is that network_info for ipless ports contains an empty list of subnets:

      {"id": "24e7183c-6e96-44cb-be67-b76f93dc8a6b", "address": "fa:16:3e:3a:a6:fa", "network": {"id": "720efe1b-e34c-4cc4-b9f8-6f186f6e0443", "bridge": "br-int", "label": "internal-2", "subnets": [], "meta": {"injected": false, "tenant_id": "3ff81dd8d02b450cb7b0da6197c33a93", "mtu": 1442, "physical_network": null, "tunneled": true}}, "type": "ovs", "details": {"port_filter": true, "connectivity": "l2"}, "devname": "tap24e7183c-6e", "ovs_interfaceid": "24e7183c-6e96-44cb-be67-b76f93dc8a6b", "qbh_params": null, "qbg_params": null, "active": true, "vnic_type": "normal", "profile": {}, "preserve_on_delete": true, "delegate_create": true, "meta": {}}

      As a result, the netutils library skips adding such VIFs to the 'links' list (empty 'subnets' -> VIF is skipped): https://github.com/openstack/nova/blob/e2ef2240b1e732b359d29457cc12abc7554fa286/nova/virt/netutils.py#L189

      This is inconsistent: the 'links' list contains information about the L2 connection and doesn't require an IP address/subnet.
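
The skip behavior can be reduced to a few lines. The sketch below mirrors the effect of the linked netutils code, not its actual implementation, and the VIF dicts use made-up interface names and MACs: a VIF with an empty subnet list never contributes a link, even though a link is purely L2.

```python
# Minimal sketch of the skip behavior; mirrors the effect of the linked
# nova netutils code, not its real implementation.
def build_links(network_info):
    """Build the 'links' list, skipping any VIF whose subnet list is empty."""
    links = []
    for vif in network_info:
        if not vif["network"]["subnets"]:
            # ipless port: skipped entirely, so its L2 info never
            # reaches network_data.json
            continue
        links.append({"id": vif["devname"],
                      "ethernet_mac_address": vif["address"]})
    return links

vifs = [
    {"devname": "tap-a", "address": "fa:16:3e:00:00:01",   # made-up values
     "network": {"subnets": [{"cidr": "192.0.2.0/24"}]}},
    {"devname": "tap-b", "address": "fa:16:3e:00:00:02",
     "network": {"subnets": []}},                          # ipless port
]
links = build_links(vifs)
```

Here only tap-a makes it into 'links'; tap-b's MAC and device name, which the guest would need to configure the interface, are silently dropped.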

      Version-Release number of selected component (if applicable):
      RHOSP 17.1

      How reproducible:

      $ openstack port create --disable-port-security --no-security-group --network external-net --fixed-ip "subnet=external-sub" port1
      $ openstack port create --disable-port-security --no-security-group --network testnet --no-fixed-ip port2
      $ openstack server create --image testimage --flavor m1.nano --port port1 --port port2 testvm1

      Inside test VM:

      1. mount -o ro /dev/sr0 /mnt
      2. cat /mnt/openstack/2020-10-14/network_data.json
        Unknown macro: {"links"}

      Actual results:
      'links' list doesn't contain ipless ports

      Expected results:
      'links' list contains ipless ports

              jkreger@redhat.com Julia Kreger
              rhos-dfg-hardprov