Uploaded image for project: 'OpenShift Bugs'
  1. OpenShift Bugs
  2. OCPBUGS-71212

agent-based installation fails to bootstrap networking due to pre-network-manager-config.service timeout on bare metal with VLAN interfaces

    • None
    • False
    • Hide

      None

      Show
      None
    • None
    • Moderate
    • None
    • None
    • None
    • Agent Sprint 282
    • 1
    • Done
    • Bug Fix
    • In baremetal installations with VLANs, it is possible for the agent-installer pre-network-manager-config service to timeout. This fix increased the timeout to handle these type of installations.
    • None
    • None
    • None
    • None

      This is a clone of issue OCPBUGS-63386. The following is the description of the original issue:

      Description of problem:

      During agent-based OpenShift cluster installations on bare metal environments with VLAN interfaces, CoreOS fails to bootstrap networking automatically. The pre-network-manager-config.service times out during execution, preventing NetworkManager from applying configuration profiles. As a result, the node boots without any active network interfaces, and installation cannot proceed unless networking is manually configured through nmcli or nmtui via console access.

      Journal logs from affected nodes show:

      pre-network-manager-config.service: start operation timed out. Terminating.
      ls: cannot access '/tmp/tmp.../hostX/*.nmconnection': No such file or directory
      
      Manually increasing the timeout (TimeoutStartSec=300s) in the systemd unit file allows the service to complete successfully, which confirms that the service times out before completing its configuration tasks.
      This issue affects automation of agent-based installations and should be fixed permanently—either by increasing the default timeout value or by making it configurable.

      Impact:

      Automation of agent-based OpenShift cluster installation on bare metal nodes is blocked due to networking not being initialized automatically.

      Workaround:

      Patch the ignition or modify the pre-network-manager-config.service to increase TimeoutStartSec to 300 seconds, then reload and restart the service manually.

      Expected results:

      During agent-based installation on bare metal systems, the CoreOS node should automatically configure and bring up networking without any manual intervention.

      Additional info:

      Case: https://access.redhat.com/support/cases/04281998 

              bfournie@redhat.com Robert Fournier
              rhn-support-ksuthar Komal Suthar
              None
              None
              zhenying niu zhenying niu
              None
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

                Created:
                Updated:
                Resolved: