RHEL-139078

ip=auto doesn't wait for all stacks to be configured


    • Story
    • Resolution: Unresolved
    • Normal
    • rhel-9.2.0.z, rhel-9.4.z, rhel-9.6.z
    • NetworkManager
    • rhel-net-mgmt

      Definition of Done:

      Please mark each item below with ( / ) if completed or ( x ) if incomplete:

      ( ) The acceptance criteria defined below are met.

      Given a NIC with kernel argument ip=auto-wait in a dual-stack network,

      When nm-initrd-generator creates the connection profile during initramfs,

      Then:

      • NetworkManager waits for both IPv4 and IPv6 configuration attempts to complete (using a configurable timeout via the required-timeout property)
      • The timeout value is configurable and can be set to a value shorter than the default 20s to meet specific use case requirements

      ( ) Integration test case is available upstream.


      ( ) Code is reviewed and merged upstream.


      ( ) Preliminary testing is done.


      ( ) Upstream documentation is written in the upstream MR.


      ( ) Release notes text is written in the RHEL issue.


      ( ) A demo is recorded
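
      To illustrate the acceptance criteria above, here is a minimal sketch of a connection profile in which both stacks carry a required-timeout shorter than 20s. It is not nm-initrd-generator output: the build_profile helper, the eth0 interface name, and the 5000 ms value are hypothetical, and the sketch assumes the keyfile spells the property as required-timeout (in milliseconds) under the [ipv4] and [ipv6] sections.

```python
import configparser
import io


def build_profile(ifname: str, required_timeout_ms: int) -> str:
    """Render a keyfile-style profile (hypothetical helper, for illustration only)."""
    cp = configparser.ConfigParser()
    cp["connection"] = {
        "id": ifname,
        "type": "ethernet",
        "interface-name": ifname,
    }
    # Same settings for both stacks: try automatic configuration, wait up to
    # the required timeout for it, and do not fail the whole activation if
    # this stack never gets an address (may-fail).
    for section in ("ipv4", "ipv6"):
        cp[section] = {
            "method": "auto",
            "may-fail": "true",
            "required-timeout": str(required_timeout_ms),
        }
    buf = io.StringIO()
    cp.write(buf, space_around_delimiters=False)
    return buf.getvalue()


if __name__ == "__main__":
    # 5000 ms is an arbitrary example of a value shorter than 20s.
    print(build_profile("eth0", 5000))
```

      Whether a kernel argument should expose the timeout, and under which name, is left open by this sketch.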


      Context: The Assisted-Installer tool allows users to install OpenShift easily on their own infrastructure (BM, VMs, on premises or in a cloud).

      What were you trying to do that didn't work?

      In a dual-stack environment, using ip=auto as a kernel argument only waits for one stack to become available. This causes problems when downloading files during the initramfs (the RHCOS ignition file in our case), because the stack needed to fetch the file might not be configured yet.

      What is the impact of this issue to you?

      In the past, we worked around ip=auto by explicitly configuring only the NIC and the stacks that we detect during the installation process. This works well for day1 installations, because we have all the details needed to decide what the network configuration should be. The day2 process (adding a node to an existing cluster) does not work as well, because we don't get as many details, so we leave ip=auto on such nodes.

      On top of that, we added support for iSCSI, which we need to configure explicitly because of the explicit configuration above (otherwise, the NIC dedicated to the iSCSI volume is left unconfigured, and the volume cannot be mounted).

      This combination led to subtle bugs where machines are left with an unconfigured network stack and get stuck at boot time.

      The main goal of this bug is to review this complexity and see if NetworkManager can help us make things simpler.

      Please provide the package NVR for which the bug is seen:

      NetworkManager / nm-initrd-generator

      How reproducible is this bug?:

      On a machine connected to a dual-stack network and configured with `ip=auto`, if DHCPv4 is a bit slow, then only IPv6 is configured.

      Steps to reproduce

      See above

      Expected results

      • ensure that all possible stacks are configured on a NIC (best effort, after a timeout)
      • do not fail if a NIC does not get any configuration (best effort, after a timeout)
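
      The difference between the reported behavior and the expected results above can be sketched as a small timing model. This is only an illustration of the requested semantics, not NetworkManager code: the function names are hypothetical, and each stack is reduced to the time (in seconds) at which its configuration completes, or None if it never does.

```python
from typing import Dict, Optional, Set, Tuple

ReadyTimes = Dict[str, Optional[float]]


def first_stack_wins(ready_at: ReadyTimes, timeout: float) -> Tuple[Set[str], float]:
    """Model of the reported behavior: proceed as soon as any stack is up."""
    times = [t for t in ready_at.values() if t is not None and t <= timeout]
    if not times:
        # What happens when nothing comes up is not modeled here.
        return set(), timeout
    proceed_at = min(times)
    done = {s for s, t in ready_at.items() if t is not None and t <= proceed_at}
    return done, proceed_at


def wait_all_best_effort(ready_at: ReadyTimes, timeout: float) -> Tuple[Set[str], float]:
    """Model of the expected behavior: wait for every stack, up to the timeout."""
    finished = [t for t in ready_at.values() if t is not None and t <= timeout]
    if finished and len(finished) == len(ready_at):
        proceed_at = max(finished)  # everything is configured, no need to wait longer
    else:
        proceed_at = timeout  # best effort: give up waiting, but do not fail
    done = {s for s, t in ready_at.items() if t is not None and t <= proceed_at}
    return done, proceed_at


if __name__ == "__main__":
    slow_dhcpv4 = {"ipv4": 4.0, "ipv6": 1.0}
    print(first_stack_wins(slow_dhcpv4, timeout=5.0))
    print(wait_all_best_effort(slow_dhcpv4, timeout=5.0))
    print(wait_all_best_effort({"ipv4": None, "ipv6": None}, timeout=5.0))
```

      In this model, a slow DHCPv4 server leaves the first function with IPv6 only, while the second one returns both stacks, or an empty set after the timeout for a NIC that gets no configuration at all.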

      Actual results

      • with ip=auto only one stack might be configured due to timings.

      In a Slack thread, we discussed using ip=dhcp,dhcp6, which would match the expected behavior. The only issue is the default, non-configurable timeout of 20s, which might be a bit long and can impact users when rebooting their machines.

      After testing, it looks like ip=dhcp,dhcp6 expects at least one stack to be configured on a NIC; otherwise we end up in emergency mode. In some setups, only some of the NICs are configured with DHCP (in Oracle Cloud Infrastructure, for example, the primary NIC is configured with DHCP, and secondary ones must be configured statically).

              nm-team Network Management Team
              agentil@redhat.com Adrien Gentil
              Network Management Team
              Vladimir Benes