Bug
Resolution: Done-Errata
Normal
4.14.z
Quality / Stability / Reliability
Moderate
In Progress
Bug Fix
Description of problem:
AdminPolicyBasedExternalRoute fails to create routes for pods created after the AdminPolicyBasedExternalRoute CR is created. However, it does create routes for pods that already existed before the CR was created. This issue occurs on a 120-node bare-metal environment while running Perf&Scale ICNI2 tests. Note: we have disabled BFD in this testing because of https://issues.redhat.com/browse/OCPBUGS-25449
Version-Release number of selected component (if applicable):
4.14.1
How reproducible:
Always
Steps to Reproduce:
#!/bin/bash
set -x
# Create served-ns-1 and serving-ns-1 namespaces:
echo "Create served-ns-1 and serving-ns-1 namespaces"
date -u
cat <<EOF | kubectl apply -f -
---
apiVersion: v1
kind: Namespace
metadata:
  name: served-ns-1
  labels:
    kubernetes.io/metadata.name: served-ns-1
spec: {}
---
apiVersion: v1
kind: Namespace
metadata:
  name: serving-ns-1
  labels:
    kubernetes.io/metadata.name: serving-ns-1
spec: {}
EOF
sleep 120
# create SRIOV network
date -u
echo "create SRIOV network"
cat <<EOF | kubectl apply -f -
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: sriov-net-1
  namespace: openshift-sriov-network-operator
spec:
  ipam: |
    {
      "type": "static"
    }
  spoofChk: "off"
  trust: "on"
  resourceName: intelnics2
  networkNamespace: serving-ns-1
EOF
sleep 120
# create served pod before AdminPolicyBasedExternalRoute
date -u
echo "create served pod before AdminPolicyBasedExternalRoute"
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: pod-served-before
  namespace: served-ns-1
spec:
  nodeSelector:
    kubernetes.io/hostname: worker010-fc640
  containers:
  - args:
    - sleep
    - infinity
    name: app
    image: quay.io/centos/centos
    imagePullPolicy: IfNotPresent
EOF
sleep 120
# create AdminPolicyBasedExternalRoute CR
date -u
echo "create AdminPolicyBasedExternalRoute CR"
cat <<EOF | kubectl apply -f -
apiVersion: k8s.ovn.org/v1
kind: AdminPolicyBasedExternalRoute
metadata:
  name: honeypotting
spec:
  ## gateway example
  from:
    namespaceSelector:
      matchLabels:
        kubernetes.io/metadata.name: served-ns-1
  nextHops:
    dynamic:
    - podSelector:
        matchLabels:
          lb: lb-1
      namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: serving-ns-1
      networkAttachmentName: serving-ns-1/sriov-net-1
EOF
sleep 120
# create serving pod
date -u
echo "create serving pod"
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: pod-serving-1
  namespace: serving-ns-1
  labels:
    serving: true-1
    lb: lb-1
  annotations:
    k8s.v1.cni.cncf.io/networks: |-
      [{
        "name": "sriov-net-1",
        "ips": [ "192.168.219.2/21" ]
      }]
    k8s.v1.cni.cncf.io/network-status: |-
      [{
        "name": "serving-ns-1/sriov-net-1",
        "interface": "net1",
        "ips": [ "192.168.219.2" ],
        "dns": {}
      }]
spec:
  containers:
  - name: frr
    image: centos
    command:
    - sleep
    - infinity
    securityContext:
      privileged: true
  nodeSelector:
    kubernetes.io/hostname: worker003-fc640
EOF
sleep 120
# create served pod after AdminPolicyBasedExternalRoute
date -u
echo "create served pod after AdminPolicyBasedExternalRoute"
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: pod-served-after
  namespace: served-ns-1
spec:
  nodeSelector:
    kubernetes.io/hostname: worker010-fc640
  containers:
  - args:
    - sleep
    - infinity
    name: app
    image: quay.io/centos/centos
    imagePullPolicy: IfNotPresent
EOF
sleep 120
date -u
echo "oc get pods -n served-ns-1 -o wide"
oc get pods -n served-ns-1 -o wide
date -u
echo "oc rsh -n openshift-ovn-kubernetes -c nbdb ovnkube-node-vjd8p ovn-nbctl lr-route-list GR_worker010-fc640"
oc rsh -n openshift-ovn-kubernetes -c nbdb ovnkube-node-vjd8p ovn-nbctl lr-route-list GR_worker010-fc640
Actual results:
We see routes only for the pod-served-before pod (i.e. 10.129.38.10):
+ echo 'oc get pods -n served-ns-1 -o wide'
oc get pods -n served-ns-1 -o wide
+ oc get pods -n served-ns-1 -o wide
NAME                READY   STATUS    RESTARTS   AGE    IP             NODE              NOMINATED NODE   READINESS GATES
pod-served-after    1/1     Running   0          2m     10.129.38.11   worker010-fc640   <none>           <none>
pod-served-before   1/1     Running   0          8m1s   10.129.38.10   worker010-fc640   <none>           <none>
+ date -u
Sat Feb 10 13:04:31 UTC 2024
+ echo 'oc rsh -n openshift-ovn-kubernetes -c nbdb ovnkube-node-vjd8p ovn-nbctl lr-route-list GR_worker010-fc640'
oc rsh -n openshift-ovn-kubernetes -c nbdb ovnkube-node-vjd8p ovn-nbctl lr-route-list GR_worker010-fc640
+ oc rsh -n openshift-ovn-kubernetes -c nbdb ovnkube-node-vjd8p ovn-nbctl lr-route-list GR_worker010-fc640
IPv4 Routes
Route Table <main>:
             10.129.38.10             192.168.219.2 src-ip rtoe-GR_worker010-fc640 ecmp-symmetric-reply
         169.254.169.0/29             169.254.169.4 dst-ip rtoe-GR_worker010-fc640
            10.128.0.0/14                100.64.0.1 dst-ip
                0.0.0.0/0             192.168.216.1 dst-ip rtoe-GR_worker010-fc640
Expected results:
We should see routes for both pods: pod-served-before (10.129.38.10) and pod-served-after (10.129.38.11).
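The expected result can also be checked mechanically. What follows is a minimal sketch and not part of the original reproducer: `has_src_route` is a hypothetical helper that reads `ovn-nbctl lr-route-list` output on stdin and succeeds only if a `src-ip` route exists for the given pod IP. It assumes the column order seen in the actual results (prefix, next hop, policy). The sample input is copied from the output in this report.

```shell
#!/bin/bash
# has_src_route POD_IP
# Hypothetical helper: reads lr-route-list output on stdin and exits 0 only
# if some line has POD_IP as its prefix and "src-ip" as its policy column.
has_src_route() {
  local pod_ip="$1"
  awk -v ip="$pod_ip" '
    $1 == ip && $3 == "src-ip" { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Route list copied from the actual results in this report.
sample_routes='IPv4 Routes
Route Table <main>:
             10.129.38.10             192.168.219.2 src-ip rtoe-GR_worker010-fc640 ecmp-symmetric-reply
         169.254.169.0/29             169.254.169.4 dst-ip rtoe-GR_worker010-fc640
            10.128.0.0/14                100.64.0.1 dst-ip
                0.0.0.0/0             192.168.216.1 dst-ip rtoe-GR_worker010-fc640'

# Check both served pods against the captured route list.
for ip in 10.129.38.10 10.129.38.11; do
  if printf '%s\n' "$sample_routes" | has_src_route "$ip"; then
    echo "route present for $ip"
  else
    echo "route MISSING for $ip"
  fi
done
```

Against a live cluster, the same helper could be fed from the command used in the reproducer, for example `oc rsh -n openshift-ovn-kubernetes -c nbdb ovnkube-node-vjd8p ovn-nbctl lr-route-list GR_worker010-fc640 | has_src_route 10.129.38.11`.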
Additional info:
must-gather - http://storage.scalelab.redhat.com/anilvenkata/must-gather.local.6409385997838158348.tar.gz
All the resources in the above case were created between Sat Feb 10 12:52:28 UTC 2024 and Sat Feb 10 13:04:31 UTC 2024. Please use these timestamps when reading the logs.
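To narrow logs to the window given above, a filter along the following lines may help. This is a sketch under a stated assumption: each log line is assumed to begin with an RFC 3339 UTC timestamp (for example `2024-02-10T12:53:01.123456789Z`), which compares correctly as a plain string. `in_window` is a hypothetical helper, and the three sample log lines are made up purely to exercise it.

```shell
#!/bin/bash
# in_window START END
# Hypothetical helper: prints stdin lines whose first field, truncated to
# YYYY-MM-DDTHH:MM:SS, lies between START and END inclusive.
# Assumption: log lines start with an RFC 3339 UTC timestamp.
in_window() {
  awk -v lo="$1" -v hi="$2" '
    { ts = substr($1, 1, 19) }          # drop fractional seconds and "Z"
    ts >= lo && ts <= hi { print }      # string comparison on timestamps
  '
}

# Made-up sample lines (not taken from the must-gather).
sample_log='2024-02-10T12:50:00.000000000Z before window
2024-02-10T12:53:01.123456789Z inside window
2024-02-10T13:05:00.000000000Z after window'

# Window boundaries taken from this report.
printf '%s\n' "$sample_log" | in_window 2024-02-10T12:52:28 2024-02-10T13:04:31
```

If the logs in the must-gather use a different timestamp prefix, the `substr` call and the boundary strings would need to be adjusted to match.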
- depends on: OCPBUGS-29680 AdminPolicyBasedExternalRoute CRD failing to watch and reconcile routes for later pods (Closed)
- duplicates: OCPBUGS-29939 [OVN] Static routes for the AdminPolicyBasedExternalRoute don't get recreated on the gateway routers when pods restart (Closed)
- is cloned by: OCPBUGS-29680 AdminPolicyBasedExternalRoute CRD failing to watch and reconcile routes for later pods (Closed)
- links to: RHBA-2024:1564 OpenShift Container Platform 4.14.z bug fix update