OpenShift Bugs / OCPBUGS-42031

Failed to Revert Bond br-ex Mode

    • Important
    • None
    • Rejected
    • False
      * When the bond mode in the `NodeNetworkConfigurationPolicy` is changed from `balance-rr` to `active-backup` on kernel bonds that are attached to the `br-ex` interface, the change might fail on arbitrary nodes. As a workaround, create a `NodeNetworkConfigurationPolicy` object without specifying the bond port configuration. (link:https://issues.redhat.com/browse/OCPBUGS-42031[*OCPBUGS-42031*])
    • Known Issue
    • Done

      Description of problem:
      The configuration failed to revert on one worker, while the other applied it successfully.

      Version-Release number of selected component (if applicable):
      kubernetes-nmstate-operator.4.17.0-202409161407 

      How reproducible:
      100%

      Steps to Reproduce:
      1. Change bond mode to balance-rr
      2. Run traffic/failover tests
      3. Revert bond mode back to active-backup
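
For illustration, step 1 might be expressed as a policy like the one below. This is only a sketch, modeled on the `mode` policy shown under Additional info with the mode swapped to `balance-rr`. The policy name `bond-mode-balance-rr` is hypothetical, and the policy actually used in the test is not included in this report, so the node selector and the absence of a port list are assumptions.

```yaml
# Sketch only: not the policy used by the reporter.
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: bond-mode-balance-rr        # hypothetical name
spec:
  maxUnavailable: 3                 # copied from the "mode" policy in this report
  nodeSelector:
    node-role.kubernetes.io/worker: ""
  desiredState:
    interfaces:
    - name: bond0
      description: Change mode
      type: bond
      state: up
      link-aggregation:
        mode: balance-rr            # step 3 would set this back to active-backup
        # No port list here, in line with the workaround in the release note.
```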

      Actual results:
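      The revert fails on one of the two nodes when the policy specifies the bond ports (output below). It works fine without the port configuration (see Additional info):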

      $ oc describe nncp -A
      Name:         changebondmode
      Namespace:    
      Labels:       <none>
      Annotations:  nmstate.io/webhook-mutating-timestamp: 1726368718988958883
      API Version:  nmstate.io/v1
      Kind:         NodeNetworkConfigurationPolicy
      Metadata:
        Creation Timestamp:  2024-09-15T02:51:58Z
        Generation:          1
        Resource Version:    776290
        UID:                 47fcf71b-9b75-416b-8836-dd55926feda4
      Spec:
        Desired State:
          Interfaces:
            Link - Aggregation:
              Mode:  active-backup
              Options:
                fail_over_mac:  none
              Port:
                ens1f0v0
                ens1f1v0
            Name:   bond0
            State:  up
            Type:   bond
        Node Selector:
          node-role.kubernetes.io/workercnf:  
      Status:
        Conditions:
          Last Heartbeat Time:               2024-09-15T02:54:14Z
          Last Transition Time:              2024-09-15T02:54:14Z
          Reason:                            FailedToConfigure
          Status:                            False
          Type:                              Available
          Last Heartbeat Time:               2024-09-15T02:54:14Z
          Last Transition Time:              2024-09-15T02:54:14Z
          Message:                           1/2 nodes failed to configure
          Reason:                            FailedToConfigure
          Status:                            True
          Type:                              Degraded
          Last Heartbeat Time:               2024-09-15T02:54:14Z
          Last Transition Time:              2024-09-15T02:54:14Z
          Reason:                            ConfigurationProgressing
          Status:                            False
          Type:                              Progressing
        Last Unavailable Node Count Update:  2024-09-15T02:54:14Z
      Events:
        Type     Reason           Age    From                      Message
        ----     ------           ----   ----                      -------
        Warning  ReconcileFailed  8m19s  worker-1.nmstate-handler  error reconciling NodeNetworkConfigurationPolicy on node worker-1 at desired state apply: "",
       rolling back desired state configuration: failed runnig probes after network changes: failed runnig probe 'ping' with after network reconfiguration -> currentState: hostname:
      .......
        external_ids:
          hostname: worker-1
          ovn-enable-lflow-cache: 'true'
          ovn-encap-ip: 10.46.77.3
          ovn-encap-type: geneve
          ovn-is-interconn: 'true'
          ovn-memlimit-lflow-cache-kb: '1048576'
          ovn-monitor-all: 'true'
          ovn-ofctrl-wait-before-clear: '0'
          ovn-openflow-probe-interval: '180'
          ovn-remote: unix:/var/run/ovn/ovnsb_db.sock
          ovn-remote-probe-interval: '180000'
          ovn-set-local-ip: 'true'
          rundir: /var/run/openvswitch
          system-id: 4f58abc6-2246-4606-82d8-3d574d99cf55
        other_config:
          bundle-idle-timeout: '180'
          ovn-chassis-idx-4f58abc6-2246-4606-82d8-3d574d99cf55: ''
          vlan-limit: '0'
      ovn:
        bridge-mappings:
        - localnet: physnet
          bridge: br-ex
      
      : context deadline exceeded

      Expected results:

      Additional info:

      $ oc get nncp -A -oyaml
      apiVersion: v1
      items:
      - apiVersion: nmstate.io/v1
        kind: NodeNetworkConfigurationPolicy
        metadata:
          annotations:
            kubectl.kubernetes.io/last-applied-configuration: |
              {"apiVersion":"nmstate.io/v1","kind":"NodeNetworkConfigurationPolicy","metadata":{"annotations":{},"name":"mode"},"spec":{"desiredState":{"interfaces":[{"description":"Change mode","link-aggregation":{"mode":"active-backup"},"name":"bond0","state":"up","type":"bond"}]},"maxUnavailable":3,"nodeSelector":{"node-role.kubernetes.io/worker":""}}}
            nmstate.io/webhook-mutating-timestamp: "1726509854030132671"
          creationTimestamp: "2024-09-16T18:04:14Z"
          generation: 1
          name: mode
          resourceVersion: "79454"
          uid: 7acb5e9e-f374-4b45-a70f-c5fba2dfc4bd
        spec:
          desiredState:
            interfaces:
            - description: Change mode
              link-aggregation:
                mode: active-backup
              name: bond0
              state: up
              type: bond
          maxUnavailable: 3
          nodeSelector:
            node-role.kubernetes.io/worker: ""
        status:
          conditions:
          - lastHeartbeatTime: "2024-09-16T18:04:18Z"
            lastTransitionTime: "2024-09-16T18:04:18Z"
            message: 2/2 nodes successfully configured
            reason: SuccessfullyConfigured
            status: "True"
            type: Available
          - lastHeartbeatTime: "2024-09-16T18:04:18Z"
            lastTransitionTime: "2024-09-16T18:04:18Z"
            reason: SuccessfullyConfigured
            status: "False"
            type: Degraded
          - lastHeartbeatTime: "2024-09-16T18:04:18Z"
            lastTransitionTime: "2024-09-16T18:04:18Z"
            reason: ConfigurationProgressing
            status: "False"
            type: Progressing
          lastUnavailableNodeCountUpdate: "2024-09-16T18:04:18Z"
      kind: List
      metadata:
        resourceVersion: "" 

            ellorent Felix Enrique Llorente Pastora
            rhn-cnf-elevin Evgeny Levin