Type: Bug
Resolution: Done-Errata
Priority: Major
Fix Version/s: None
Affects Version/s: 4.12, 4.11
Component/s: Networking / openshift-sdn
Labels:
- CNO

Severity:
Moderate
Regression:
None
Story Points:
1
Sprint:
SDN Sprint 240, SDN Sprint 241
sprint_count:
2
Release Blocker:
Rejected
Blocked:
False
Blocked Reason:

None
Target Version:

4.13.z

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

I haven't gone back to pin down all affected versions, but I wouldn't be surprised if we've had this exposure for a while. On a 4.12.0-ec.2 cluster, we have:

cluster:usage:resources:sum{resource="podnetworkconnectivitychecks.controlplane.operator.openshift.io"}

currently clocking in around 67983. I've gathered a dump with:

$ oc --as system:admin -n openshift-network-diagnostics get podnetworkconnectivitychecks.controlplane.operator.openshift.io | gzip >checks.gz

Many of these reference nodes which no longer exist (the cluster is aggressively autoscaled, with nodes coming and going all the time). We should fix garbage collection on this resource, so that the Kube API server and etcd do not consume excessive memory when they list the large resource set.
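To make the leak concrete, here is a minimal sketch of the comparison a cleanup pass would need: given the checks and the nodes that currently exist, pick out the checks that point at a departed node. This is not the logic from the linked fix. The function name `find_stale_checks` and the `check_nodes` mapping (check name to the node names it references) are hypothetical, and how that mapping is derived from a real PodNetworkConnectivityCheck object is left as an assumption.

```python
from typing import Dict, Iterable, List


def find_stale_checks(
    check_nodes: Dict[str, Iterable[str]],
    live_nodes: Iterable[str],
) -> List[str]:
    """Return the names of checks that reference at least one missing node.

    check_nodes is a hypothetical mapping from check name to the node names
    that check references; building it from real objects is out of scope.
    """
    live = set(live_nodes)
    return sorted(
        name
        for name, nodes in check_nodes.items()
        if any(node not in live for node in nodes)
    )


if __name__ == "__main__":
    # Illustrative data only; these names do not come from a real cluster.
    checks = {
        "check-a-to-b": ["node-a", "node-b"],
        "check-a-to-gone": ["node-a", "node-gone"],
    }
    for name in find_stale_checks(checks, ["node-a", "node-b"]):
        print(name)
```

On an aggressively autoscaled cluster the stale set grows with every node that leaves, which is consistent with the count reported above.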

clones

OCPBUGS-1341 Node churn leaks PodNetworkConnectivityChecks

Closed

depends on

OCPBUGS-1341 Node churn leaks PodNetworkConnectivityChecks

Closed

links to

openshift/cluster-network-operator#1950: [release-4.13] OCPBUGS-17721: Enhance check controller to remove old check objects

RHBA-2023:4905 OpenShift Container Platform 4.13.z bug fix update

Assignee: Periyasamy Palanisamy

Reporter: W. Trevor King

QA Contact: Mike Fiedler

Votes: 0

Watchers: 7

Created: 2023/08/15 1:10 PM

Updated: 2023/09/05 1:20 AM

Resolved: 2023/09/05 1:20 AM
