OpenShift Bugs / OCPBUGS-76628

Several nodes degraded as OVN controller fails to add "k8s.ovn.org/node-subnets" annotation

    • Bug
    • Resolution: Done
    • 4.17.z, 4.18.z
    • Critical

      Description of problem:

      • Several nodes reported a NotReady state in two separate occurrences, each on a different 4.y z-stream.
      • On closer inspection, the first occurrence was due to a missing "k8s.ovn.org/node-subnets" annotation on the nodes [1].
        • A second occurrence showed the same behaviour, this time during the upgrade to 4.18.28, blocking the cluster upgrade.

      Version-Release number of selected component (if applicable):

      • We've hit this issue twice on the same cluster: initially on 4.17.43 and later during the upgrade to 4.18.28 (the issue is currently blocking the cluster upgrade).

      Additional info:

      • There are commonalities with the behaviour observed in the parallel OCPBUGS-66267:
        • The issue is only observed in a cluster with Hybrid Overlay enabled (65 other environments without Hybrid Overlay are not seeing similar behaviour).
        • We're seeing V4SubnetAllocationThresholdExceeded firing in a cluster that is well within the clusterNetwork capacity: 40 nodes [0] against a total of 256 host subnets [2].
          • On closer inspection, with the impacted nodes NotReady [0], ovnkube_clustermanager_allocated_v4_host_subnets was reported to reach ovnkube_clustermanager_num_v4_host_subnets.
          • Restarting the OVN control plane pods restored the value of ovnkube_clustermanager_allocated_v4_host_subnets, but neither this nor restarting the ovnkube-node pod on the impacted nodes remediated the missing annotation on the nodes.
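
To see which nodes are still missing the annotation after a restart, something along the following lines may help. This is a minimal sketch, not a command taken from this report: the function name nodes_missing_subnet_annotation is hypothetical, and it assumes its input is two whitespace-separated columns (node name, annotation value) in which an absent annotation is rendered as "<none>", as `oc get` does for custom-columns output.

```shell
# Hypothetical helper (not part of oc or OVN-Kubernetes).
# Reads "NAME SUBNETS" rows on stdin and prints the names of the nodes
# whose annotation column is "<none>", i.e. the annotation is absent.
nodes_missing_subnet_annotation() {
  awk '$2 == "<none>" { print $1 }'
}

# Assumed usage against a live cluster (not executed by this sketch):
#   oc get nodes --no-headers \
#     -o custom-columns='NAME:.metadata.name,SUBNETS:.metadata.annotations.k8s\.ovn\.org/node-subnets' \
#     | nodes_missing_subnet_annotation
```

Comparing the resulting list with the NotReady nodes counted in [0] would show whether every degraded node lacks the annotation.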

       

      [0]

      $ oc get no --no-headers | wc -l
      40
      $ oc get no --no-headers | grep "NotReady" | wc -l
      13
      

      [1]

      2026-01-27T10:20:19.076792681Z I0127 10:20:19.076747  101870 default_node_network_controller.go:779] Waiting for node $NODE to start, no annotation found on node for subnet: could not find "k8s.ovn.org/node-subnets" annotation
      2026-01-27T10:20:19.576630275Z I0127 10:20:19.576583  101870 default_node_network_controller.go:779] Waiting for node $NODE to start, no annotation found on node for subnet: could not find "k8s.ovn.org/node-subnets" annotation
      2026-01-27T10:20:20.075777353Z I0127 10:20:20.075725  101870 default_node_network_controller.go:779] Waiting for node $NODE to start, no annotation found on node for subnet: could not find "k8s.ovn.org/node-subnets" annotation 

      [2]

      $ CIDR=`oc get network cluster -o json | jq -r '.spec.clusterNetwork[].cidr'`
      $ PFX=`oc get network cluster -o json | jq -r '.spec.clusterNetwork[].hostPrefix'`
      $ ipcalc -S $PFX $CIDR
      
      $ ipcalc -S $PFX $CIDR | tail -2
      Total:    256
      Hosts/Net:  254
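
The capacity figure in [2] can also be cross-checked without ipcalc. The sketch below assumes a single IPv4 clusterNetwork entry; the function name host_subnet_count is hypothetical, and the CIDR and hostPrefix used in the usage line are illustrative values, since the report does not state the cluster's actual ones.

```shell
# Hypothetical helper: host_subnet_count CIDR HOST_PREFIX
# Prints how many per-node subnets of size /HOST_PREFIX fit into CIDR (IPv4),
# which is 2^(HOST_PREFIX - CIDR prefix length).
host_subnet_count() {
  local cidr_prefix="${1#*/}"   # "10.128.0.0/16" -> "16"
  local host_prefix="$2"
  if [ "$host_prefix" -lt "$cidr_prefix" ]; then
    echo "hostPrefix must not be shorter than the CIDR prefix" >&2
    return 1
  fi
  echo $(( 1 << (host_prefix - cidr_prefix) ))
}

# Illustrative usage (values are assumptions, not taken from this cluster):
#   host_subnet_count 10.128.0.0/16 24
```

With 40 nodes [0], the allocated count would be expected to stay far below this total, which is why the alert firing looks inconsistent with the configured capacity.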
      

       

              asuryana Aswin Suryanarayanan
              rhn-support-rsandu Robert Sandu
              Anurag Saxena Anurag Saxena