OCPBUGS-62681

BFD not enabling in ovn pods

    • Quality / Stability / Reliability
    • Moderate
    • x86_64

      The customer wants to enable the BFD protocol on the OVN pods, but after setting options:bfd=enabled on the logical router port, the setting vanishes from the configuration within a few minutes.

      OCP 4.18.21

      ovn-nbctl --version
      ovn-nbctl 24.09.3
      Open vSwitch Library 3.4.3
      DB Schema 7.6.0

      Enabled BFD (the command had to be issued twice before the option was set):

      ovn-nbctl -v set Logical_Router_Port rtoe-GR_master-0.clu04238914v2.lab.upshift.rdu2.redhat.com options:bfd=enabled

      Checked the config (the option vanishes after a few minutes):

      [root@master-0 ~]# ovn-nbctl list Logical_Router_Port rtoe-GR_master-0.clu04238914v2.lab.upshift.rdu2.redhat.com
      _uuid : 907598d9-ec50-4946-8d27-66a60f160d9f
      dhcp_relay : []
      enabled : true
      external_ids : {gateway-physical-ip=yes}
      gateway_chassis : []
      ha_chassis_group : []
      ipv6_prefix : []
      ipv6_ra_configs : {}
      mac : "fa:16:3e:e3:39:db"
      name : rtoe-GR_master-0.clu04238914v2.lab.upshift.rdu2.redhat.com
      networks : ["10.0.89.230/21"]
      options : {bfd=enabled}
      peer : "10.0.94.79"
      status : {}
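
To pin down when the option is removed, the options column of the port can be polled and every change logged with a timestamp. The following is a minimal sketch and not part of the original report: the helper name watch_lrp_options, the poll interval, and the iteration cap are illustrative assumptions. It relies only on the generic database command `ovn-nbctl get <table> <record> <column>`.

```shell
#!/bin/sh
# Sketch: poll the options column of a Logical_Router_Port and print a
# timestamped line whenever its value changes.
# watch_lrp_options is a hypothetical helper name (not an OVN tool).
# Usage: watch_lrp_options PORT [INTERVAL_SECONDS] [MAX_POLLS]
watch_lrp_options() {
    port=$1
    interval=${2:-10}     # assumed default: poll every 10 seconds
    max_polls=${3:-60}    # assumed default: stop after 60 polls
    prev="<unset>"
    i=0
    while [ "$i" -lt "$max_polls" ]; do
        # "get" prints the whole map, e.g. {bfd=enabled} or {}
        cur=$(ovn-nbctl get Logical_Router_Port "$port" options) || return 1
        if [ "$cur" != "$prev" ]; then
            printf '%s options=%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$cur"
            prev=$cur
        fi
        i=$((i + 1))
        sleep "$interval"
    done
}
```

Run in the same place where the ovn-nbctl commands above were issued, for example `watch_lrp_options rtoe-GR_master-0.clu04238914v2.lab.upshift.rdu2.redhat.com 10 60`. The timestamp of the line where the value changes back to {} can then be compared against the logs in the must-gather.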

      A peer IP address was also configured, but the BFD session does not come up (the peer runs FRR installed through dnf):

      [quickcluster@upi-0 cgroup]$ sudo vtysh

      Hello, this is FRRouting (version 8.5.3).
      Copyright 1996-2005 Kunihiro Ishiguro, et al.

      upi-0.clu04238914v2.lab.upshift.rdu2.redhat.com# show bfd peer
      BFD Peers:
      peer 10.0.89.230 vrf default (peer is the IP from the ovn pod)
      ID: 3062223890
      Remote ID: 0
      Active mode
      Status: down
      Downtime: 5 day(s), 22 hour(s), 59 minute(s), 4 second(s)
      Diagnostics: ok
      Remote diagnostics: ok
      Peer Type: configured
      RTT min/avg/max: 0/0/0 usec
      Local timers:
      Detect-multiplier: 3
      Receive interval: 300ms
      Transmission interval: 300ms
      Echo receive interval: 50ms
      Echo transmission interval: disabled
      Remote timers:
      Detect-multiplier: 3
      Receive interval: 1000ms
      Transmission interval: 1000ms
      Echo receive interval: disabled

      peer 10.0.94.79 vrf default (peer is the own IP from the node)
      ID: 3482460252
      Remote ID: 3482460252
      Active mode
      Status: up
      Uptime: 3 day(s), 3 hour(s), 41 minute(s), 5 second(s)
      Diagnostics: ok
      Remote diagnostics: ok
      Peer Type: configured
      RTT min/avg/max: 0/0/0 usec
      Local timers:
      Detect-multiplier: 3
      Receive interval: 300ms
      Transmission interval: 300ms
      Echo receive interval: 50ms
      Echo transmission interval: disabled
      Remote timers:
      Detect-multiplier: 3
      Receive interval: 300ms
      Transmission interval: 300ms
      Echo receive interval: 50ms
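
When the peer state is checked repeatedly, the verbose output above can be reduced to one line per peer. This is a minimal sketch under the assumption that the output keeps the layout shown in this report (a line starting with "peer" followed by the address, and a "Status:" line); bfd_peer_summary is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: reduce "show bfd peer" output (read from stdin) to one
# "<peer-address> <status>" line per peer.
# bfd_peer_summary is a hypothetical helper name (not an FRR tool).
bfd_peer_summary() {
    awk '
        $1 == "peer"    { peer = $2 }        # e.g. "peer 10.0.89.230 vrf default"
        $1 == "Status:" { print peer, $2 }   # e.g. "Status: down"
    '
}

# Example invocation on the FRR host:
#   sudo vtysh -c "show bfd peer" | bfd_peer_summary
```

Applied to the output above, this prints "10.0.89.230 down" and "10.0.94.79 up".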

      Ping works in both directions:

      From pod to node:

      ping -c 4 10.0.94.79
      PING 10.0.94.79 (10.0.94.79) 56(84) bytes of data.
      64 bytes from 10.0.94.79: icmp_seq=1 ttl=64 time=2.16 ms
      64 bytes from 10.0.94.79: icmp_seq=2 ttl=64 time=4.85 ms
      64 bytes from 10.0.94.79: icmp_seq=3 ttl=64 time=0.241 ms
      64 bytes from 10.0.94.79: icmp_seq=4 ttl=64 time=0.324 ms

      --- 10.0.94.79 ping statistics ---
      4 packets transmitted, 4 received, 0% packet loss, time 3034ms
      rtt min/avg/max/mdev = 0.241/1.894/4.849/1.870 ms



      From node to pod:

      ping -c 4 10.0.89.230
      upi-0.clu04238914v2.lab.upshift.rdu2.redhat.com# exit
      [quickcluster@upi-0 cgroup]$ ping -c 4 10.0.89.230
      PING 10.0.89.230 (10.0.89.230) 56(84) bytes of data.
      64 bytes from 10.0.89.230: icmp_seq=1 ttl=64 time=6.04 ms
      64 bytes from 10.0.89.230: icmp_seq=2 ttl=64 time=5.06 ms
      64 bytes from 10.0.89.230: icmp_seq=3 ttl=64 time=0.534 ms
      64 bytes from 10.0.89.230: icmp_seq=4 ttl=64 time=0.312 ms

      --- 10.0.89.230 ping statistics ---
      4 packets transmitted, 4 received, 0% packet loss, time 3021ms
      rtt min/avg/max/mdev = 0.312/2.986/6.043/2.587 ms




      A few minutes after BFD is configured, the option vanishes from the port:

      ovn-nbctl list Logical_Router_Port rtoe-GR_master-0.clu04238914v2.lab.upshift.rdu2.redhat.com
      _uuid : 907598d9-ec50-4946-8d27-66a60f160d9f
      dhcp_relay : []
      enabled : true
      external_ids : {gateway-physical-ip=yes}
      gateway_chassis : []
      ha_chassis_group : []
      ipv6_prefix : []
      ipv6_ra_configs : {}
      mac : "fa:16:3e:e3:39:db"
      name : rtoe-GR_master-0.clu04238914v2.lab.upshift.rdu2.redhat.com
      networks : ["10.0.89.230/21"]
      options : {}
      peer : "10.0.94.79"
      status : {}
      [root@master-0 ~]#

      No routes are marked with bfd either:

      IPv4 Routes
      Route Table <main>:
      169.254.0.0/17 169.254.0.4 dst-ip rtoe-GR_master-0.clu04238914v2.lab.upshift.rdu2.redhat.com
      10.128.0.0/14 100.64.0.1 dst-ip
      0.0.0.0/0 10.0.95.254 dst-ip rtoe-GR_master-0.clu04238914v2.lab.upshift.rdu2.redhat.com
      [root@master-0 ~]# ovn-nbctl -v
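
The route check can be scripted so that the working and the failing cluster are compared in the same way. This is a minimal sketch with two assumptions: that the listing above comes from `ovn-nbctl lr-route-list <router>`, and that a BFD-enabled route carries a standalone "bfd" token on its line. count_bfd_routes is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: count the routes that carry a "bfd" token in a route listing
# read from stdin.
# count_bfd_routes is a hypothetical helper name; treating a standalone
# "bfd" field as the BFD marker is an assumption about the listing format.
count_bfd_routes() {
    awk '
        { for (i = 1; i <= NF; i++) if ($i == "bfd") { n++; break } }
        END { print n + 0 }
    '
}

# Example invocation (<router> is a placeholder for the gateway router name):
#   ovn-nbctl lr-route-list <router> | count_bfd_routes
```

For the three routes listed above, the function prints 0.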

      The customer confirms that one cluster works (named ocp7) and the other one does not (ocp22).

      The must-gathers attached to ticket 04262686 will be shared via a Google Drive link.

              bbennett@redhat.com Ben Bennett
              rhn-support-fcardoso Fabio Cardoso
              Anurag Saxena