RHEL-64754

OpenShift installation fails using balance-slb - stuck at RHCOS ignition process

    • Bug
    • Resolution: Unresolved
    • Critical
    • rhel-9.6
    • rhel-9.2.0.z, rhel-9.3.0.z, rhel-9.4.z, rhel-9.5.z, rhel-9.6.z
    • dracut
    • rhel-sst-cs-bootloaders
    • 20
    • 22
    • 3
    • Dev ack
    • False
    • None
    • None
    • Approved Blocker
    • Given a system administrator is deploying an OpenShift cluster using the assisted installer or agent-based installer with a network bond configured on two interfaces (eno1 and eno2) using balance-xor mode with xmit_hash_policy=vlan+srcmac and balance-slb: 1,

      When they initiate the cluster installation process,

      Then the OpenShift nodes should successfully complete the RHCOS ignition process without errors and the cluster should deploy successfully with the network bond operating in the specified configuration.

      Definition of Done:

      • The implementation meets the acceptance criteria
      • Integration tests are written and pass
      • The code is part of a downstream build attached to an errata
    • Fail
    • Manual
    • None

      What were you trying to do that didn't work?

      Deploying an OpenShift cluster with a network bond in balance-slb mode. Here is an example of the configuration used (also attached):

      interfaces:
        - name: eno1
          description: Ethernet eno1
          type: ethernet
          state: up
          ipv4:
            enabled: false
        - name: eno2
          description: Ethernet eno1
          type: ethernet
          state: up
          ipv4:
            enabled: false
        - name: bond0
          type: bond
          state: up
          ipv4:
            address:
              - ip: 10.0.1.11
                prefix-length: 24
            enabled: true
          link-aggregation:
            mode: balance-xor
            options:
              xmit_hash_policy: vlan+srcmac
              balance-slb: 1
            port:
              - eno1
              - eno2
      dns-resolver:
        config:
          search:
            - gfontana.me
            - lab.gfontana.me
          server:
            - 10.0.1.2
      routes:
        config:
          - destination: 0.0.0.0/0
            next-hop-address: 10.0.1.1
            next-hop-interface: bond0

      Note: The issue only occurs when "balance-slb: 1" is set.
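
Because the failure depends on a single option, it can help to test the same bond with and without it. The following is a minimal sketch, not part of the original report: it takes the bond stanza from the configuration above and derives a control variant without "balance-slb". The helper name `without_balance_slb` is hypothetical, and the stanza omits the ipv4, DNS and route settings for brevity. The output is JSON, which is also valid YAML.

```python
import copy
import json

# Bond stanza copied from the report; "balance-slb: 1" is the option under test.
BOND = {
    "name": "bond0",
    "type": "bond",
    "state": "up",
    "link-aggregation": {
        "mode": "balance-xor",
        "options": {"xmit_hash_policy": "vlan+srcmac", "balance-slb": 1},
        "port": ["eno1", "eno2"],
    },
}


def without_balance_slb(bond):
    """Return a deep copy of the bond stanza with the balance-slb option removed."""
    variant = copy.deepcopy(bond)
    variant["link-aggregation"]["options"].pop("balance-slb", None)
    return variant


# Print both variants so they can be pasted into two otherwise identical configs.
for label, bond in (("failing", BOND), ("control", without_balance_slb(BOND))):
    print(f"# {label}")
    print(json.dumps({"interfaces": [bond]}, indent=2))
```

If the control variant installs and the original does not, that supports the observation that "balance-slb: 1" is the trigger.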

      What is the impact of this issue to you?

      Clusters cannot be deployed with bond mode balance-slb. Tested with OCP 4.16.16 and 4.17, using both the Assisted Installer and the Agent Based Installer.

      How reproducible is this bug?: Happens 100% of the time

      Steps to reproduce

      1. Using either the Assisted Installer or the Agent Based Installer, deploy a cluster with a bond over two network interfaces. The bond must be configured as follows:

        link-aggregation:
          mode: balance-xor
          options:
            xmit_hash_policy: vlan+srcmac
            balance-slb: 1
          port:
          - eno1
          - eno2
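
To check which host configurations are affected before starting an installation, the network state can be scanned for this exact combination. The sketch below assumes the state has already been loaded into a Python dict (for example from the attached YAML); the function name `bonds_using_balance_slb` is hypothetical, and the accepted spellings of a true value ("1", "true") are an assumption.

```python
def bonds_using_balance_slb(state):
    """Return the names of bonds that match the failing combination in this report."""
    matches = []
    for iface in state.get("interfaces", []):
        if iface.get("type") != "bond":
            continue
        aggregation = iface.get("link-aggregation", {})
        options = aggregation.get("options", {})
        # Normalise 1, "1" and True to a comparable string (assumed spellings).
        slb_enabled = str(options.get("balance-slb", "")).lower() in ("1", "true")
        if (
            aggregation.get("mode") == "balance-xor"
            and options.get("xmit_hash_policy") == "vlan+srcmac"
            and slb_enabled
        ):
            matches.append(iface.get("name"))
    return matches


# Usage with the bond from the steps above.
state = {
    "interfaces": [
        {"name": "eno1", "type": "ethernet"},
        {
            "name": "bond0",
            "type": "bond",
            "link-aggregation": {
                "mode": "balance-xor",
                "options": {"xmit_hash_policy": "vlan+srcmac", "balance-slb": 1},
                "port": ["eno1", "eno2"],
            },
        },
    ]
}
print(bonds_using_balance_slb(state))
```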

      Expected results

      Cluster deployed successfully

      Actual results

      Node deployment gets stuck after the first reboot, during ignition. Supporting evidence is attached. I noticed the following error message in the console:

      `nft configuration for balance-slb failed`
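
If a text capture of the node console is available, the message can be located with a simple search. This is a sketch only: it assumes the console output has been saved as text, the function name `find_balance_slb_failures` is hypothetical, and the sample input contains a made-up placeholder line next to the message quoted above.

```python
ERROR_MARKER = "nft configuration for balance-slb failed"


def find_balance_slb_failures(log_text):
    """Return (line_number, line) pairs for console lines containing the error."""
    return [
        (number, line.strip())
        for number, line in enumerate(log_text.splitlines(), start=1)
        if ERROR_MARKER in line
    ]


# Usage: the first line is a placeholder, not real console output.
sample = "placeholder console line\n" + ERROR_MARKER + "\n"
for number, line in find_balance_slb_failures(sample):
    print(f"{number}: {line}")
```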

       

        1. balance-slb-failed.png
          377 kB
          Giovanni Fontana
        2. image (19).png
          163 kB
          Giovanni Fontana
        3. nmstate-bond0-marge-balance-slb.yaml
          0.7 kB
          Giovanni Fontana

              Pavel Valena (pvalena@redhat.com)
              Giovanni Fontana (giofontana)
              dracut maint mailing list
              Frantisek Sumsal