Type: Bug
Resolution: Won't Do
Priority: Normal
Fix Version/s: None
Affects Version/s: 4.16.z
Component/s: Networking / ovn-kubernetes
Labels:
- SDN:OVNK:EgressIP

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:
None
Story Points:
None
Severity:
Important
Regression:
None

Target Backport Versions:
None
Target Version:
None
Release Blocker:
None
Sprint:
CORENET Sprint 275
sprint_count:
1

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Review Complete:
PX Priority Data:
PX Impact Score:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

Environment: over 600 EgressIPs on the cluster, 6 egress-assignable nodes, 130 nodes in total.
Observe in a given project that one EgressIP is assigned properly to the namespace, through two matchLabels rules for the namespace.
Observe that in that project, 6 pods are correctly bound to this EgressIP (they have the expected SNAT entries for this EgressIP address).
Observe that 2 pods are errantly bound to different EgressIP objects (they have SNAT entries for those two separate egress IPs), but those EgressIP objects do NOT match this target project, nor do they have podSelector rules that would otherwise acquire these pods.
Observe also that at least one of the pods that has NO egress SNAT rule has duplicate `logical_ip` entries across 3 nodes: there is a SNAT-to-host-IP NAT table entry for this pod on 3 separate nodes, not tied to any EgressIP. See data highlights below.
Observe also that multiple pods have no NAT entries at all, even on the node hosting them, which may indicate they were deleted before the NAT dump was taken.
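
The duplicate `logical_ip` observation above can be checked mechanically once the per-node NAT dumps are flattened into records. The following is a minimal sketch, not the tooling used in this case: the record layout (`node`, `type`, `logical_ip`, `external_ip`), the function name `duplicate_host_snat`, and the sample addresses are all assumptions made for illustration. It ignores SNAT rows whose external IP is a known egress IP and reports any pod IP that still has SNAT rows on more than one node.

```python
from collections import defaultdict


def duplicate_host_snat(rows, egress_ips):
    """Return {pod_ip: [nodes]} for pod IPs with non-egress SNAT rows on >1 node.

    rows: iterable of dicts with keys node, type, logical_ip, external_ip
          (hypothetical layout, one dict per NAT row, tagged with its node).
    egress_ips: set of addresses that belong to EgressIP objects.
    """
    nodes_by_pod_ip = defaultdict(set)
    for row in rows:
        if row["type"] != "snat":
            continue  # only SNAT rows are relevant here
        if row["external_ip"] in egress_ips:
            continue  # SNAT to an egress IP is expected on the egress node
        nodes_by_pod_ip[row["logical_ip"]].add(row["node"])
    return {
        ip: sorted(nodes)
        for ip, nodes in nodes_by_pod_ip.items()
        if len(nodes) > 1
    }


# Illustrative data only; these are not values from the customer cluster.
sample_egress_ips = {"192.0.2.100"}
sample_rows = [
    {"node": "node-a", "type": "snat", "logical_ip": "10.128.2.15", "external_ip": "192.0.2.11"},
    {"node": "node-b", "type": "snat", "logical_ip": "10.128.2.15", "external_ip": "192.0.2.12"},
    {"node": "node-c", "type": "snat", "logical_ip": "10.128.2.15", "external_ip": "192.0.2.13"},
    {"node": "node-a", "type": "snat", "logical_ip": "10.128.2.16", "external_ip": "192.0.2.11"},
    {"node": "node-d", "type": "snat", "logical_ip": "10.128.2.16", "external_ip": "192.0.2.100"},
]

print(duplicate_host_snat(sample_rows, sample_egress_ips))
# -> {'10.128.2.15': ['node-a', 'node-b', 'node-c']}
```

In this sample, the second pod is not flagged because its only extra SNAT row points at an egress IP, which matches the expected layout described under "Expected results".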

Version-Release number of selected component (if applicable):

Observed on OpenShift 4.16.17

How reproducible:

Observed once; an internal reproducer has not been built.

Steps to Reproduce:

unclear

Actual results:

Duplicated `logical_ip` entries could be causing the EgressIP NAT handling checks to fail, since there is more than one NAT entry for the same pod.
EgressIP allocation rules could also be failing to apply correctly, leading to misallocated flow state.

Expected results:

EgressIP allocation logic should be consistent: pods in namespace A should never be allocated to EgressIP B when no matchLabels rules apply.
EgressIP binding for a given namespace should affect ALL pods in that namespace unless they are selectively omitted via selection rules.
NAT entries for a given pod should appear only once on its host node (SNAT to node IP) and once on the EgressIP host (SNAT to egress IP), not 3 times across 3 nodes for SNAT to host IP.
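
The selection expectation above can be expressed as a small check. This is a sketch under stated assumptions, not the ovn-kubernetes implementation: it only handles `matchLabels` (no `matchExpressions`), and the helper names `labels_match` and `selects_pod`, as well as the label keys and values, are hypothetical. It relies on standard Kubernetes label-selector semantics, where every `matchLabels` pair must be present and an empty selector matches everything.

```python
def labels_match(match_labels, labels):
    """matchLabels semantics: every key/value pair must be present in labels."""
    return all(labels.get(key) == value for key, value in match_labels.items())


def selects_pod(egress_ip_spec, namespace_labels, pod_labels):
    """True if the EgressIP spec selects a pod with these namespace and pod labels.

    Simplification (assumption): only matchLabels is evaluated; an absent or
    empty selector is treated as matching everything.
    """
    ns_selector = (egress_ip_spec.get("namespaceSelector") or {}).get("matchLabels") or {}
    pod_selector = (egress_ip_spec.get("podSelector") or {}).get("matchLabels") or {}
    return labels_match(ns_selector, namespace_labels) and labels_match(pod_selector, pod_labels)


# Illustrative objects only; label names and values are made up.
eip_a = {"namespaceSelector": {"matchLabels": {"team": "payments", "env": "prod"}}}
eip_b = {"namespaceSelector": {"matchLabels": {"team": "billing"}}}
ns_a_labels = {"team": "payments", "env": "prod"}
pod_labels = {"app": "web"}

print(selects_pod(eip_a, ns_a_labels, pod_labels))  # True: both namespace labels match
print(selects_pod(eip_b, ns_a_labels, pod_labels))  # False: namespace label differs
```

Running such a check for each errantly bound pod against every EgressIP object would show whether any selector could legitimately have acquired it.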

Additional info:

Affected Platforms:

Red Hat OpenShift Container Platform: Customer issue.
Case number:
04072915

//Attachments in supportshell:
0020-inspect.local.3368678593394018855.zip --> project inspect with impacted pods
0010-egress_problem.zip --> must-gather
0060-Archive.zip --> NAT table dump and egressIP cross-check output from KCS: https://access.redhat.com/solutions/7110252

//DATA REQUESTED/PENDING:

sosreports from the egressIP assignable nodes
network-log must-gather
namespace inspect from openshift-ovn-kubernetes and openshift-multus

//workaround suggested:

Rebuild the ovnkube DB on all egress-assignable hosts to regenerate the egress SNAT entry lists and refresh bind rules for all applicable pods (after data gathering is complete).

Assignee: Martin Kennelly

Reporter: Will Russell

Need Info From: None

Contributors: Ketan Lakhwara

QA Contact: Jean Chen

Doc Contact: None

Votes: 0

Watchers: 9

Created: 2025/04/07 6:18 PM

Updated: 2025/09/24 8:17 PM

Resolved: 2025/09/15 11:04 AM
