Bug
Resolution: Unresolved
Normal
None
4.14.z
Important
None
False
Customer Escalated
Description of problem:
There appear to be several issues with the dual-stack implementation of egressIPs, in this case when egressIPs are configured on additional networks.
The customer has 2 nodes configured as egressIP nodes, and these nodes have a set of VLANs to handle the egress traffic for the configured egressIPs. In general there aren't many EgressIP CRs (5 or 6 at most), and each EgressIP has 2 IPs, one per node.
So far it looks like we have several distinct issues that may require separate Jiras, but I'm not sure whether their combination is the main problem affecting the customer.
- To start, we only see one IP of one version per node in .status.items. This seems related to the PR we mentioned here:
- In the CRD this is mostly informational, but the problem goes beyond what is displayed in the status fields. Even though the CRD allows a combination of both IP versions, it looks like OVN can't process them as such:
1. The IP version chosen seems random, and once one version is chosen the other is not assigned. For example, yesterday I recreated my EIP with dual stack and only IPv6 was assigned and configured in the LRPs and nodes. Today, after booting the cluster, IPv4 was chosen and only IPv4 is created in OVN:
apiVersion: k8s.ovn.org/v1
kind: EgressIP
metadata:
  creationTimestamp: "2024-11-12T15:06:20Z"
  generation: 5
  name: egress-agnhost-websrv
  resourceVersion: "5689602"
  uid: b6597209-8fb9-472e-96e1-26cf61c3f387
spec:
  egressIPs:
  - 172.23.183.20
  - 172.23.183.21
  - fdca:5d7b:fdda:f266::28:7b90
  - fdca:5d7b:fdda:f266::28:7b6f
  namespaceSelector:
    matchLabels:
      kubernetes.io/metadata.name: agnhost-websrv-testing
  podSelector: {}
status:
  items:
  - egressIP: 172.23.183.21
    node: infra-0.prod-openshift4.redhatrules.local
  - egressIP: 172.23.183.20
    node: infra-1.prod-openshift4.redhatrules.local
-------------------------------------------
LRP on ovnkube-node-cmhkx
-------------------------------------------
_uuid : 075e3964-b9bc-4e90-bc2c-0000d30e55b3
action : reroute
external_ids : {name=egress-agnhost-websrv}
match : "ip4.src == 10.192.14.18"
nexthop : []
nexthops : ["10.192.6.2"]
options : {}
priority : 100
_uuid : 96b2506c-7f2e-41c9-8ebd-50f84a183150
action : reroute
external_ids : {name=egress-agnhost-websrv}
match : "ip4.src == 10.192.14.28"
nexthop : []
nexthops : ["10.192.6.2"]
options : {}
priority : 100
_uuid : a7f04cdd-ccf2-4670-8d20-6fd4cf3a813b
action : reroute
external_ids : {name=egress-agnhost-websrv}
match : "ip4.src == 10.192.12.31"
nexthop : []
nexthops : ["10.192.6.2"]
options : {}
priority : 100
_uuid : 2b9885bd-fdef-4a7f-85fd-7f7a34a416d2
action : reroute
external_ids : {name=egress-agnhost-websrv}
match : "ip4.src == 10.192.12.24"
nexthop : []
nexthops : ["10.192.6.2"]
options : {}
priority : 100
-------------------------------------------
LRP on ovnkube-node-bsqb7
-------------------------------------------
_uuid : 196166b8-a256-40a5-8295-f9f6429c0183
action : reroute
external_ids : {name=egress-agnhost-websrv}
match : "ip4.src == 10.192.14.28"
nexthop : []
nexthops : ["100.88.0.3", "100.88.0.6"]
options : {}
priority : 100
_uuid : c38a2a9b-87a8-496e-9702-cfc7caf05414
action : reroute
external_ids : {name=egress-agnhost-websrv}
match : "ip4.src == 10.192.14.18"
nexthop : []
nexthops : ["100.88.0.3", "100.88.0.6"]
options : {}
Looking at the node, there is also no reference to IPv6 addresses:
https://privatebin.corp.redhat.com/?72d65eb8050416cb#6E5Jf7vNY1EFxbQPe7Mf3KdckjXvsVwD8wnQhd61NMJG
2. If the chosen IP version changes, it can disrupt ongoing connections, and it suggests that dual stack is not really supported with egressIPs in the way the customer expects.
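To make the mismatch between spec and status easier to spot across several EgressIP CRs, something like the following could be used. It is only a sketch: the function name unassigned_by_family is made up for illustration, and the dictionary is a hand-copied subset of the EgressIP object shown above (in practice it would come from parsing the object's YAML or JSON). It lists the IPs from spec.egressIPs that never appear in status.items, grouped by IP family.

```python
import ipaddress
from collections import defaultdict


def unassigned_by_family(egressip: dict) -> dict:
    """Return spec.egressIPs missing from status.items, grouped by IP family.

    Hypothetical helper for illustration; not part of OVN-Kubernetes.
    """
    spec_ips = egressip.get("spec", {}).get("egressIPs", [])
    items = egressip.get("status", {}).get("items", [])
    # Parse addresses so that different textual forms of one IP compare equal.
    assigned = {ipaddress.ip_address(item["egressIP"]) for item in items}

    missing = defaultdict(list)
    for ip in spec_ips:
        addr = ipaddress.ip_address(ip)
        if addr not in assigned:
            missing[f"IPv{addr.version}"].append(ip)
    return dict(missing)


# Subset of the EgressIP object from this report.
egressip = {
    "spec": {
        "egressIPs": [
            "172.23.183.20",
            "172.23.183.21",
            "fdca:5d7b:fdda:f266::28:7b90",
            "fdca:5d7b:fdda:f266::28:7b6f",
        ]
    },
    "status": {
        "items": [
            {"egressIP": "172.23.183.21",
             "node": "infra-0.prod-openshift4.redhatrules.local"},
            {"egressIP": "172.23.183.20",
             "node": "infra-1.prod-openshift4.redhatrules.local"},
        ]
    },
}

print(unassigned_by_family(egressip))
```

For the object above, only the two IPv6 addresses are reported as unassigned, which matches what is seen in the status and in the LRPs.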
Version-Release number of selected component (if applicable):
OCP 4.14 on bare metal
How reproducible:
Unknown. So far it is hard to say what actually triggers the issue with inconsistent DB entries for the LRPs.
Steps to Reproduce:
1. Enable dual-stack network
2. Create egressIPs with dual stack on additional networks
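After following the steps above, one way to check which IP families actually received reroute policies is to scan a saved logical router policy dump for ip4 and ip6 matches. This is a rough sketch that assumes the dump has the same "match : ..." layout as the LRP output pasted in this report; the function name count_policy_families and the sample text are for illustration only.

```python
import re
from collections import Counter

# Matches lines such as:  match : "ip4.src == 10.192.14.18"
MATCH_RE = re.compile(r'^match\s*:\s*"(ip[46])\.src\s*==')


def count_policy_families(dump: str) -> Counter:
    """Count reroute policy match lines per IP family (ip4 / ip6)."""
    counts = Counter()
    for line in dump.splitlines():
        m = MATCH_RE.match(line.strip())
        if m:
            counts[m.group(1)] += 1
    return counts


# Two policies copied from the LRP output in this report.
sample = '''
_uuid : 075e3964-b9bc-4e90-bc2c-0000d30e55b3
action : reroute
external_ids : {name=egress-agnhost-websrv}
match : "ip4.src == 10.192.14.18"
nexthops : ["10.192.6.2"]
_uuid : 96b2506c-7f2e-41c9-8ebd-50f84a183150
action : reroute
external_ids : {name=egress-agnhost-websrv}
match : "ip4.src == 10.192.14.28"
nexthops : ["10.192.6.2"]
'''

counts = count_policy_families(sample)
print("ip4 policies:", counts["ip4"])
print("ip6 policies:", counts["ip6"])
```

A count of zero for ip6 while the EgressIP spec contains IPv6 addresses would correspond to the behavior described in this bug.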
account is impacted by: OCPBUGS-44793 [4.14] EgressIP intermittent connection timeout while communicating with external services (Closed)