OpenShift Bugs / OCPBUGS-64597

frr-k8s pods in CrashLoopBackOff state after EUS update from 4.16 to 4.18


    • Quality / Stability / Reliability
    • Important
    • CNF Network Sprint 279

      Description of problem:

      After an EUS update from 4.16 to 4.18, frr-k8s pods on nodes with VRFs configured enter the CrashLoopBackOff state.
      The frr logs show zebra aborting on signal 11 (SIGSEGV) with route_node_get at the crash PC:
          
      2025/11/03 14:05:20.735 BGP: [ZXFVW-H54SV] Rx Intf up VRF 260 IF bond0.357vrf
      2025/11/03 14:05:20.735 BGP: [ZXFVW-H54SV] Rx Intf up VRF 260 IF bond0.357vrf
      ZEBRA: Received signal 11 at 1762178720 (si_addr 0x55c52f4e1c98, PC 0x7f99b1f5a922); aborting...
      ZEBRA: /usr/lib64/frr/libfrr.so.0(zlog_backtrace_sigsafe+0x71) [0x7f99b1f27b01]
      ZEBRA: /usr/lib64/frr/libfrr.so.0(zlog_signal+0xf5) [0x7f99b1f27d05]
      ZEBRA: /usr/lib64/frr/libfrr.so.0(+0xcf465) [0x7f99b1f50465]
      ZEBRA: /lib64/libc.so.6(+0x3e6f0) [0x7f99b1c856f0]
      ZEBRA: /usr/lib64/frr/libfrr.so.0(route_node_get+0x72) [0x7f99b1f5a922]
      ZEBRA: /usr/lib64/frr/libfrr.so.0(srcdest_rnode_get+0x14) [0x7f99b1f51154]
      ZEBRA: /usr/libexec/frr/zebra(+0xe83ad) [0x55c52b9fa3ad]
      ZEBRA: /usr/lib64/frr/libfrr.so.0(work_queue_run+0x88) [0x7f99b1f67588]
      ZEBRA: /usr/lib64/frr/libfrr.so.0(thread_call+0x81) [0x7f99b1f60b71]
      ZEBRA: /usr/lib64/frr/libfrr.so.0(frr_run+0xe8) [0x7f99b1f233c8]
      ZEBRA: /usr/libexec/frr/zebra(main+0x386) [0x55c52b995366]
      ZEBRA: /lib64/libc.so.6(+0x29590) [0x7f99b1c70590]
      ZEBRA: /lib64/libc.so.6(__libc_start_main+0x80) [0x7f99b1c70640]
      ZEBRA: /usr/libexec/frr/zebra(_start+0x25) [0x55c52b995a55]
      ZEBRA: in thread work_queue_run scheduled from lib/workqueue.c:136 work_queue_schedule()
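
To make triage of backtraces like the one above less manual, the following is a minimal Python sketch that finds the frame whose address equals the PC reported on the "Received signal" line. The regular expressions are assumptions derived only from the log lines quoted in this report, not from any documented FRR log format, and `faulting_frame` is a hypothetical helper name.

```python
import re

# Assumed shape of a backtrace frame, taken from the log in this report:
#   ZEBRA: /usr/lib64/frr/libfrr.so.0(route_node_get+0x72) [0x7f99b1f5a922]
FRAME_RE = re.compile(
    r"^\s*(?P<daemon>[A-Z]+): (?P<binary>/\S+?)"
    r"\((?P<symbol>[^)+]*)\+(?P<offset>0x[0-9a-f]+)\) "
    r"\[(?P<addr>0x[0-9a-f]+)\]\s*$"
)

# Assumed shape of the signal line, taken from the same log:
#   ZEBRA: Received signal 11 at 1762178720 (si_addr 0x..., PC 0x...); aborting...
SIGNAL_RE = re.compile(
    r"Received signal (?P<signo>\d+) at \d+ "
    r"\(si_addr 0x[0-9a-f]+, PC (?P<pc>0x[0-9a-f]+)\)"
)


def faulting_frame(log_text):
    """Return (signal number, symbol, binary) for the frame at the crash PC.

    Returns None if no signal line is found or no frame matches the PC.
    The symbol is None when the frame carries only an offset.
    """
    sig = SIGNAL_RE.search(log_text)
    if sig is None:
        return None
    for line in log_text.splitlines():
        frame = FRAME_RE.match(line)
        if frame and frame.group("addr") == sig.group("pc"):
            symbol = frame.group("symbol") or None
            return int(sig.group("signo")), symbol, frame.group("binary")
    return None
```

Applied to the log above, the helper would be expected to point at the `route_node_get` frame in `libfrr.so.0`, since that frame's address equals the reported PC.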
      

      Version-Release number of selected component (if applicable):

      OCP 4.18.27
      metallb-operator.v4.18.0-202510210939
      frr-8.5.3-4.el9.x86_64
          

      How reproducible:

      So far, this has occurred on 2 different setups.
          

      Steps to Reproduce:

          1. Deploy a bare-metal dual-stack OCP 4.16 cluster per the RDS Core spec.
          2. Perform an EUS update to 4.18.
          3. Check the frr-k8s pods running on nodes with VRFs configured.
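
The check in the last step can be scripted. The sketch below is a minimal, hedged example that takes the parsed JSON of a pod list (for instance, the output of `oc get pods -n <namespace> -o json`, where the namespace is whichever one hosts frr-k8s in the cluster under test and is not stated in this report) and returns the pods with a container waiting in CrashLoopBackOff. `crashlooping_pods` is a hypothetical helper name; the field paths follow the Kubernetes Pod status schema.

```python
def crashlooping_pods(pod_list):
    """Return sorted (pod name, node name) pairs for crash-looping pods.

    `pod_list` is the dict obtained by parsing a Kubernetes PodList in JSON
    form. A pod is reported once if any of its containers is waiting with
    reason "CrashLoopBackOff".
    """
    hits = []
    for pod in pod_list.get("items", []):
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for status in statuses:
            # "waiting" is absent (or null) when the container is running
            # or terminated, so fall back to an empty dict.
            waiting = status.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == "CrashLoopBackOff":
                name = pod["metadata"]["name"]
                node = pod.get("spec", {}).get("nodeName", "")
                hits.append((name, node))
                break  # one report per pod is enough
    return sorted(hits)
```

Comparing the returned node names against the list of nodes that have VRFs configured would show whether the failures are confined to those nodes, as described in this report.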
          

      Actual results:

      frr-k8s pods running on the nodes with VRFs configured are in CrashLoopBackOff state
          

      Expected results:

      frr-k8s pods running on the nodes with VRFs configured are running without issues
          

              fpaoline@redhat.com Federico Paolinelli
              yprokule@redhat.com Yurii Prokulevych