OpenShift Bugs / OCPBUGS-51055

nmstate-operator & Shim interface: Node stuck in NotReady after reboot


    • Type: Bug
    • Resolution: Not a Bug
    • 4.18, 4.18.z
    • Quality / Stability / Reliability
    • Important
    • Rejected

      Files: https://drive.google.com/drive/folders/1Drsm3pHLDo-n1oWAw9VsXetvoRb5w5qu?usp=drive_link

      Description of problem:

      When a NodeNetworkConfigurationPolicy is applied to create a shim interface, the interface is initially configured correctly and everything works as expected. As soon as the node is rebooted, however, it gets stuck in the NotReady state because configure-ovs.sh fails to recreate OVNKubernetes.
      
      The NodeNetworkConfigurationPolicy is based on the ODF multus ceph-public-net example here: https://docs.redhat.com/en/documentation/red_hat_openshift_data_foundation/4.17/html/planning_your_deployment/network-requirements_rhodf#multus-examples_rhodf
      
      This currently makes it impossible to run a stable Multus deployment of ODF.
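
      For orientation, a minimal sketch of a shim-interface policy in the style of the ODF ceph-public-net example is shown below. It is not the attached YAML: the policy name, the shim interface name, and all IP addresses and routes are hypothetical placeholders, and only the node name (worker-001) and the NIC name (ence400) are taken from this report.

      ```yaml
      apiVersion: nmstate.io/v1
      kind: NodeNetworkConfigurationPolicy
      metadata:
        name: ceph-public-net-shim-worker-001   # hypothetical policy name
      spec:
        nodeSelector:
          kubernetes.io/hostname: worker-001    # node from the reproduction steps
        desiredState:
          interfaces:
            - name: odf-pub-shim                # hypothetical shim interface name
              description: Shim interface connecting the host to the ODF public Multus network
              type: mac-vlan
              state: up
              mac-vlan:
                base-iface: ence400             # the newly added NIC
                mode: bridge
                promiscuous: true
              ipv4:
                enabled: true
                dhcp: false
                address:
                  - ip: 192.168.252.1           # placeholder static IP for this node
                    prefix-length: 22
          routes:
            config:
              - destination: 192.168.20.0/24    # placeholder pod-side range
                next-hop-interface: odf-pub-shim
      ```

      A policy of this shape differs from the simpler policies mentioned under "Additional info" in that it creates a new mac-vlan interface and adds routes, rather than only setting the state or IPv4 configuration of an existing NIC.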

      Version-Release number of selected component (if applicable):

      CoreOS: 418.94.202502100215-0 / nmstate: kubernetes-nmstate-operator.4.18.0-202412040208

      How reproducible:

      I can reproduce it every time with OCP 4.16, OCP 4.17 and OCP 4.18. It works correctly with OCP 4.15. It is simple to reproduce: add a new NIC, apply the attached YAML to it, and reboot the node.

      Steps to Reproduce:

      1. Select a node, e.g. worker-001
      2. Add a new NIC, e.g. ence400
      3. Apply the NMState YAML attached to this issue
      4. Reboot the node

      Actual results:

      At first, everything works. After the node is rebooted, it remains in the NotReady state: the configure-ovs.sh script does not recreate OVNKubernetes.

      Expected results:

      The node should be Ready after reboot.

      Additional info:

      I have observed this issue since OCP 4.16.

      Please note that I have no issues with simpler NodeNetworkConfigurationPolicy objects, e.g. ones that only configure the state or the IPv4 configuration of a NIC.

       

              Benjamin Nemec (bnemec@redhat.com)
              Manuel Gotin (rh-ee-mgotin)
              Ross Brattain
              IBM Confidential Group
              Dominik Werle, Florian Leber, Muhammad Adeel, Sherine Khoury