
Type: Bug
Resolution: Unresolved
Priority: Major
Fix Version/s: 4.21.0
Affects Version/s: 4.16.z
Component/s: Networking / cloud-network-config-controller
Labels:
- SDN:OVNK:EgressIP
- SDN:Platform:CNCC

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:

None
Story Points:
None
Severity:
None
Regression:
None

Target Backport Versions:

4.17.z, 4.16.z, 4.18.z, 4.19.z, 4.20.z
Target Version:

4.21.0
Release Blocker:
None
Sprint:
CORENET Sprint 278
sprint_count:
1

Customer Impact:

Customer Escalated, Customer Facing

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

When too many CloudPrivateIPConfig objects are scheduled onto the same node, the cloud-network-config-controller (CNCC) fails to assign them due to underlying cloud provider IP limits. These objects remain in CloudResponseError state indefinitely instead of being redistributed to other available egress-assignable nodes.

Version-Release number of selected component (if applicable):

 4.16.46

How reproducible:

Intermittent, but consistently reproducible when:

- Cluster has multiple nodes labeled with k8s.ovn.org/egress-assignable.

- Workload requires more egress IPs than a single node can support (e.g., > X secondary IPs on AWS).

- CNCC continues to assign new CloudPrivateIPConfig to the saturated node.

Steps to Reproduce:

    1.
    2.
    3.

Actual results:

- CNCC keeps assigning new IPs to a saturated node.
- CloudPrivateIPConfig objects remain stuck in CloudResponseError.
- No automatic redistribution to other egress-assignable nodes.

Expected results:

- CNCC should detect that a node has reached its IP/ENI limit.
- Scheduler logic should redistribute new or failing CloudPrivateIPConfig objects to other available `egress-assignable` nodes automatically.
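The expected behavior can be illustrated with a minimal Go sketch, which is not the actual CNCC or OVN-Kubernetes code. The `nodeCapacity` type, the `pickNode` function, the least-loaded tie-breaking policy, and the capacity value of 14 are all assumptions made for illustration. It skips any node whose reported capacity is exhausted and picks another egress-assignable node instead.

```go
package main

import (
	"fmt"
	"sort"
)

// nodeCapacity is a hypothetical view of one egress-assignable node.
type nodeCapacity struct {
	Name     string
	Capacity *int // nil: no limit reported; 0: no room at all
	Assigned int  // IPs already assigned to the node
}

// remaining returns the free slots on the node. The boolean is false
// when the node reported no limit, in which case the count is meaningless.
func (n nodeCapacity) remaining() (int, bool) {
	if n.Capacity == nil {
		return 0, false
	}
	r := *n.Capacity - n.Assigned
	if r < 0 {
		r = 0
	}
	return r, true
}

// pickNode returns the least-loaded node that still has room, skipping
// the excluded node (for example, the one that just failed). Ties are
// broken by name so the result is deterministic.
func pickNode(nodes []nodeCapacity, exclude string) (string, bool) {
	candidates := make([]nodeCapacity, 0, len(nodes))
	for _, n := range nodes {
		if n.Name == exclude {
			continue
		}
		if r, limited := n.remaining(); limited && r == 0 {
			continue // saturated: the cloud API would reject another IP
		}
		candidates = append(candidates, n)
	}
	if len(candidates) == 0 {
		return "", false
	}
	sort.Slice(candidates, func(i, j int) bool {
		if candidates[i].Assigned != candidates[j].Assigned {
			return candidates[i].Assigned < candidates[j].Assigned
		}
		return candidates[i].Name < candidates[j].Name
	})
	return candidates[0].Name, true
}

func intPtr(v int) *int { return &v }

func main() {
	// Capacity values are illustrative, not real AWS limits.
	nodes := []nodeCapacity{
		{Name: "node-a", Capacity: intPtr(14), Assigned: 14}, // saturated
		{Name: "node-b", Capacity: intPtr(14), Assigned: 3},
		{Name: "node-c", Capacity: intPtr(0), Assigned: 0}, // zero capacity
	}
	name, ok := pickNode(nodes, "node-a")
	fmt.Println(name, ok)
}
```

In this sketch a failing CloudPrivateIPConfig on `node-a` would be moved to `node-b`. `node-c` is never chosen because its capacity is explicitly 0.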

Additional info:

OpenShift version: 4.16.46

Cloud provider: AWS

Example error from object status:

status:
  conditions:
  - lastTransitionTime: "2025-08-21T09:41:59Z"
    message: cloud API failed to assign IP: exceeded interface address quota
    reason: CloudResponseError
    status: "False"
    type: Assigned
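As a rough illustration of how a controller might recognize this failure, the Go sketch below checks a list of conditions for the combination shown above. The `condition` type and the `failedOnQuota` function are hypothetical stand-ins, not the real API types. Matching on the message substring is an assumption and is brittle, because the wording comes from this one example.

```go
package main

import (
	"fmt"
	"strings"
)

// condition is a hypothetical, trimmed-down stand-in for a status condition.
type condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// quotaMarker is taken from the example message in this report.
const quotaMarker = "exceeded interface address quota"

// failedOnQuota reports whether the Assigned condition is False with
// reason CloudResponseError and a message that points at an address quota.
func failedOnQuota(conds []condition) bool {
	for _, c := range conds {
		if c.Type == "Assigned" && c.Status == "False" &&
			c.Reason == "CloudResponseError" &&
			strings.Contains(c.Message, quotaMarker) {
			return true
		}
	}
	return false
}

func main() {
	conds := []condition{{
		Type:    "Assigned",
		Status:  "False",
		Reason:  "CloudResponseError",
		Message: "cloud API failed to assign IP: exceeded interface address quota",
	}}
	fmt.Println(failedOnQuota(conds))
}
```

An object that matches this check is a candidate for rescheduling onto a different node rather than for repeated retries against the same one.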

blocks

OCPBUGS-63542 [AWS, EgressIP] CNCC and OVN-Kubernetes are not handling 0 capacity in cloud env's correctly (0 and unset is not differentiated - 0 capacity is mistaken for Unlimited capacity by OVNK

Closed

is cloned by

OCPBUGS-63542 [AWS, EgressIP] CNCC and OVN-Kubernetes are not handling 0 capacity in cloud env's correctly (0 and unset is not differentiated - 0 capacity is mistaken for Unlimited capacity by OVNK

Closed

split to

OCPBUGS-63348 [AWS, EgressIP] CNCC and OVN-Kubernetes assigns more EgressIPs than available capacity

Verified

links to

openshift/cloud-network-config-controller#183: OCPBUGS-60806: Change the capacity struct from int to ptrOfInt

Assignee: Surya Seetharaman

Reporter: Harshal Thakare

Need Info From: None

Contributors: None

QA Contact: Qiong Wang

Doc Contact: None

Votes: 3

Watchers: 19

Created: 2025/08/23 6:21 PM

Updated: 2025/11/19 5:30 PM
