Bug
Resolution: Won't Do
4.18.z
Quality / Stability / Reliability
Description of problem: [BGP UDN EIP pre-merge testing] a legacy EIP (an unused IP from the same node subnet) is not advertised on a UDN
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
1. On a cluster with routeAdvertisements enabled and the FRR additionalRoutingCapabilities enabled, applied receive_all.yaml and ra.yaml to have the default network routes advertised (plausible sketches of both files follow the route output below).
$ oc get frrconfiguration -A
NAMESPACE           NAME                   AGE
openshift-frr-k8s   ovnk-generated-295j4   67m
openshift-frr-k8s   ovnk-generated-6675n   67m
openshift-frr-k8s   ovnk-generated-6mscl   67m
openshift-frr-k8s   ovnk-generated-ct5h5   67m
openshift-frr-k8s   ovnk-generated-g8rj7   67m
openshift-frr-k8s   ovnk-generated-ql8bj   67m
openshift-frr-k8s   ovnk-generated-rh769   67m
openshift-frr-k8s   ovnk-generated-x4twj   67m
openshift-frr-k8s   receive-all            67m
$ oc get ra -A
NAME      STATUS
default   Accepted
$ ip route show | grep bgp
10.128.0.0/23 via 192.168.111.22 dev offloadbm proto bgp metric 20
10.128.2.0/23 via 192.168.111.24 dev offloadbm proto bgp metric 20
10.129.0.0/23 via 192.168.111.20 dev offloadbm proto bgp metric 20
10.129.2.0/23 via 192.168.111.23 dev offloadbm proto bgp metric 20
10.130.0.0/23 via 192.168.111.21 dev offloadbm proto bgp metric 20
10.130.2.0/23 via 192.168.111.47 dev offloadbm proto bgp metric 20
10.131.0.0/23 via 192.168.111.25 dev offloadbm proto bgp metric 20
10.131.2.0/23 via 192.168.111.40 dev offloadbm proto bgp metric 20
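For reference, receive_all.yaml and ra.yaml were not attached. Based on the frrconfiguration output above and the k8s.ovn.org/v1 API shown later in this report, they presumably look roughly like the following sketches; the ASN and neighbor address are placeholders, and the default-network selector label is an assumption:

# receive_all.yaml - reconstructed sketch (ASN/neighbor values are placeholders)
apiVersion: frrk8s.metallb.io/v1beta1
kind: FRRConfiguration
metadata:
  name: receive-all
  namespace: openshift-frr-k8s
spec:
  bgp:
    routers:
    - asn: 64512                 # placeholder local ASN
      neighbors:
      - address: 192.168.111.1   # placeholder external peer
        asn: 64512               # placeholder peer ASN
        toReceive:
          allowed:
            mode: all            # accept all routes advertised by the peer

# ra.yaml - reconstructed sketch (the default-network selector label is an assumption)
apiVersion: k8s.ovn.org/v1
kind: RouteAdvertisements
metadata:
  name: default
spec:
  advertisements:
  - PodNetwork
  networkSelector:
    matchLabels:
      k8s.ovn.org/default-network: ""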
2. Created a UDN namespace, created a layer3 UDN in it, labelled the UDN with app=udn to match the networkSelector of the UDN RouteAdvertisements, applied ra_udn.yaml, and verified that the UDN pod network was advertised (a sketch of the UDN manifest follows the route output below).
$ oc get ns e2e-test-udn-networking-105ei4nw-85xtt --show-labels
NAME                                     STATUS   AGE   LABELS
e2e-test-udn-networking-105ei4nw-85xtt   Active   12m   k8s.ovn.org/primary-user-defined-network=null,kubernetes.io/metadata.name=e2e-test-udn-networking-105ei4nw-85xtt,org=qe,pod-security.kubernetes.io/audit-version=latest,pod-security.kubernetes.io/audit=restricted,pod-security.kubernetes.io/enforce-version=latest,pod-security.kubernetes.io/enforce=restricted,pod-security.kubernetes.io/warn-version=latest,pod-security.kubernetes.io/warn=restricted
$ oc get userdefinednetwork layer3-udn-99999 -n e2e-test-udn-networking-105ei4nw-85xtt --show-labels
NAME               AGE   LABELS
layer3-udn-99999   11m   app=udn
$ oc get ra -A
NAME      STATUS
default   Accepted
ra-udn    Accepted
$ oc get ra ra-udn -oyaml
apiVersion: k8s.ovn.org/v1
kind: RouteAdvertisements
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"k8s.ovn.org/v1","kind":"RouteAdvertisements","metadata":{"annotations":{},"name":"ra-udn"},"spec":{"advertisements":["PodNetwork","EgressIP"],"networkSelector":{"matchLabels":{"app":"udn"}}}}
  creationTimestamp: "2025-02-17T22:21:11Z"
  generation: 1
  name: ra-udn
  resourceVersion: "124091"
  uid: 1b967887-1524-4684-9028-467874b1bfe5
spec:
  advertisements:
  - PodNetwork
  - EgressIP
  networkSelector:
    matchLabels:
      app: udn
status:
  conditions:
  - lastTransitionTime: "2025-02-17T22:21:12Z"
    message: ovn-kubernetes cluster-manager validated the resource and requested the
      necessary configuration changes
    observedGeneration: 1
    reason: Accepted
    status: "True"
    type: Accepted
  status: Accepted
$ ip route show | grep bgp
10.128.0.0/23 via 192.168.111.22 dev offloadbm proto bgp metric 20
10.128.2.0/23 via 192.168.111.24 dev offloadbm proto bgp metric 20
10.129.0.0/23 via 192.168.111.20 dev offloadbm proto bgp metric 20
10.129.2.0/23 via 192.168.111.23 dev offloadbm proto bgp metric 20
10.130.0.0/23 via 192.168.111.21 dev offloadbm proto bgp metric 20
10.130.2.0/23 via 192.168.111.47 dev offloadbm proto bgp metric 20
10.131.0.0/23 via 192.168.111.25 dev offloadbm proto bgp metric 20
10.131.2.0/23 via 192.168.111.40 dev offloadbm proto bgp metric 20
10.150.0.0/24 via 192.168.111.23 dev offloadbm proto bgp metric 20
10.150.1.0/24 via 192.168.111.24 dev offloadbm proto bgp metric 20
10.150.2.0/24 via 192.168.111.22 dev offloadbm proto bgp metric 20
10.150.3.0/24 via 192.168.111.47 dev offloadbm proto bgp metric 20
10.150.4.0/24 via 192.168.111.20 dev offloadbm proto bgp metric 20
10.150.5.0/24 via 192.168.111.25 dev offloadbm proto bgp metric 20
10.150.6.0/24 via 192.168.111.40 dev offloadbm proto bgp metric 20
10.150.7.0/24 via 192.168.111.21 dev offloadbm proto bgp metric 20
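The UDN manifest itself was not attached; based on its name, its app=udn label, and the per-node 10.150.x.0/24 routes above, it presumably looks roughly like this sketch (the CIDR and hostSubnet are inferred from the route output, and the Primary role is an assumption):

# layer3-udn-99999 - reconstructed sketch
apiVersion: k8s.ovn.org/v1
kind: UserDefinedNetwork
metadata:
  name: layer3-udn-99999
  namespace: e2e-test-udn-networking-105ei4nw-85xtt
  labels:
    app: udn                # matched by ra-udn's networkSelector
spec:
  topology: Layer3
  layer3:
    role: Primary           # assumption: primary UDN
    subnets:
    - cidr: 10.150.0.0/21   # inferred from the advertised 10.150.0.0/24-10.150.7.0/24 routes
      hostSubnet: 24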
3. Labeled a node as an egress node (example command below).
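The exact command was not captured; presumably the standard OVN-Kubernetes egress-assignable label was applied:

$ oc label node openshift-qe-025.lab.eng.rdu2.redhat.com k8s.ovn.org/egress-assignable=""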
4. Took an unused IP from the same node subnet (referred to here as the legacy egressIP) and created an EgressIP object with it; in the meantime, labeled the UDN namespace to match the namespaceSelector of the EgressIP object (example commands follow the route output below).
$ oc get egressips.k8s.ovn.org
NAME             EGRESSIPS         ASSIGNED NODE                              ASSIGNED EGRESSIPS
egressip-99999   192.168.111.160   openshift-qe-025.lab.eng.rdu2.redhat.com   192.168.111.160
$ oc get egressips.k8s.ovn.org egressip-99999 -oyaml
apiVersion: k8s.ovn.org/v1
kind: EgressIP
metadata:
  annotations:
    k8s.ovn.org/egressip-mark: "50000"
  creationTimestamp: "2025-02-17T22:21:43Z"
  generation: 2
  name: egressip-99999
  resourceVersion: "124268"
  uid: 6176be30-e0ab-4b86-a869-474232472a2e
spec:
  egressIPs:
  - 192.168.111.160
  namespaceSelector:
    matchLabels:
      org: qe
  podSelector:
    matchLabels:
      color: pink
status:
  items:
  - egressIP: 192.168.111.160
    node: openshift-qe-025.lab.eng.rdu2.redhat.com
$ ip route show | grep bgp
10.128.0.0/23 via 192.168.111.22 dev offloadbm proto bgp metric 20
10.128.2.0/23 via 192.168.111.24 dev offloadbm proto bgp metric 20
10.129.0.0/23 via 192.168.111.20 dev offloadbm proto bgp metric 20
10.129.2.0/23 via 192.168.111.23 dev offloadbm proto bgp metric 20
10.130.0.0/23 via 192.168.111.21 dev offloadbm proto bgp metric 20
10.130.2.0/23 via 192.168.111.47 dev offloadbm proto bgp metric 20
10.131.0.0/23 via 192.168.111.25 dev offloadbm proto bgp metric 20
10.131.2.0/23 via 192.168.111.40 dev offloadbm proto bgp metric 20
10.150.0.0/24 via 192.168.111.23 dev offloadbm proto bgp metric 20
10.150.1.0/24 via 192.168.111.24 dev offloadbm proto bgp metric 20
10.150.2.0/24 via 192.168.111.22 dev offloadbm proto bgp metric 20
10.150.3.0/24 via 192.168.111.47 dev offloadbm proto bgp metric 20
10.150.4.0/24 via 192.168.111.20 dev offloadbm proto bgp metric 20
10.150.5.0/24 via 192.168.111.25 dev offloadbm proto bgp metric 20
10.150.6.0/24 via 192.168.111.40 dev offloadbm proto bgp metric 20
10.150.7.0/24 via 192.168.111.21 dev offloadbm proto bgp metric 20
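The selectors in the EgressIP above match the org=qe namespace label (already visible on the namespace in step 2) and a color=pink pod label; the exact commands were not captured, but presumably along the lines of:

$ oc label ns e2e-test-udn-networking-105ei4nw-85xtt org=qe
$ oc label pod <test-pod> -n e2e-test-udn-networking-105ei4nw-85xtt color=pink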
Actual results: the egressIP on the UDN was not advertised; no route for 192.168.111.160 appears in the BGP routes learned on the external host above.
Expected results: the egressIP on the UDN should be advertised (see the sketch below).
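For reference, had the egressIP been advertised, a host route for it would be expected on the external peer, roughly like the following; the next hop is a placeholder, since the egress node's host IP is not shown in the outputs above:

$ ip route show | grep 192.168.111.160
192.168.111.160 via <egress-node-host-ip> dev offloadbm proto bgp metric 20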
Additional info:
must-gather: https://drive.google.com/file/d/1Irup08oHaH1XQLnK4PBetipHXEBMEtBN/view?usp=drive_link
Please fill in the following template while reporting a bug and provide as much relevant information as possible. Doing so will give us the best chance to find a prompt resolution.
Affected Platforms:
Is it an
- internal CI failure
- customer issue / SD
- internal Red Hat testing failure
If it is an internal Red Hat testing failure:
- Please share a kubeconfig or creds to a live cluster for the assignee to debug/troubleshoot along with reproducer steps (especially if it's a telco use case like ICNI, secondary bridges or BM+kubevirt).
If it is a CI failure:
- Did it happen in different CI lanes? If so, please provide links to multiple failures with the same error instance
- Did it happen in both sdn and ovn jobs? If so, please provide links to multiple failures with the same error instance
- Did it happen on other platforms (e.g. aws, azure, gcp, baremetal etc)? If so, please provide links to multiple failures with the same error instance
- When did the failure start happening? Please provide the UTC timestamp of the networking outage window from a sample failure run
- If it's a connectivity issue,
- What are the srcNode, srcIP, srcNamespace and srcPodName?
- What are the dstNode, dstIP, dstNamespace and dstPodName?
- What is the traffic path? (examples: pod2pod, pod2external, pod2svc, pod2Node, etc.)
If it is a customer / SD issue:
- Provide enough information in the bug description that Engineering doesn't need to read the entire case history.
- Don't presume that Engineering has access to Salesforce.
- Do presume that Engineering will access attachments through supportshell.
- Describe what each relevant attachment is intended to demonstrate (failed pods, log errors, OVS issues, etc).
- Referring to the attached must-gather, sosreport or other attachment, please provide the following details:
- If the issue is in a customer namespace then provide a namespace inspect.
- If it is a connectivity issue:
- What are the srcNode, srcNamespace, srcPodName and srcPodIP?
- What are the dstNode, dstNamespace, dstPodName and dstPodIP?
- What is the traffic path? (examples: pod2pod, pod2external, pod2svc, pod2Node, etc.)
- Please provide the UTC timestamp of the networking outage window from the must-gather
- Please provide tcpdump pcaps taken during the outage filtered based on the above provided src/dst IPs
- If it is not a connectivity issue:
- Describe the steps taken so far to analyze the logs from networking components (cluster-network-operator, OVNK, SDN, openvswitch, ovs-configure etc) and the actual component where the issue was seen, based on the attached must-gather. Please attach snippets of relevant logs around the window when the problem happened, if any.
- When showing the results from commands, include the entire command in the output.
- For OCPBUGS in which the issue has been identified, label with "sbr-triaged"
- For OCPBUGS in which the issue has not been identified and needs Engineering help for root cause, label with "sbr-untriaged"
- Do not set the priority, that is owned by Engineering and will be set when the bug is evaluated
- Note: bugs that do not meet these minimum standards will be closed with label "SDN-Jira-template"
- For guidance on using this template please see
OCPBUGS Template Training for Networking components