OpenShift Bugs / OCPBUGS-61039

uneven distribution of kube api traffic


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Affects Version/s: 4.18.0
    • Fix Version/s: 4.18.z, 4.19.z, 4.20.0
    • Component/s: kube-apiserver
    • Quality / Stability / Reliability
    • Severity: Important
    • Release Note Type: Enhancement
    • Release Note Text:

      Previously, kube-apiserver was configured with `goaway-chance` set to 0, meaning kube-apiserver never sent GOAWAY frames, connections were never reset, and clients kept reusing the same kube-apiserver instance.
      Now, to rebalance connections between kube-apiserver instances, the apiserver is configured to send a GOAWAY frame with a probability of 1% per request. This lets connections be reset and re-established against a different kube-apiserver instance, improving platform stability and avoiding cache bloat on individual kube-apiserver instances.

      This change has no effect on single-node installations.
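      For reference, the behavior described in the release note corresponds to the upstream kube-apiserver `--goaway-chance` flag. Below is a minimal Go sketch of that mechanism, assuming a plain net/http server rather than the real apiserver filter chain; the wrapper name `withProbabilisticGoaway`, the listen address, and the certificate paths are illustrative, not taken from the product.

      // A handler wrapper that, with the given probability per request, asks
      // Go's HTTP/2 server to send a GOAWAY frame and drain the connection,
      // so the client reconnects (through the load balancer, and possibly to
      // a different kube-apiserver instance).
      package main

      import (
          "fmt"
          "math/rand"
          "net/http"
      )

      func withProbabilisticGoaway(next http.Handler, chance float64) http.Handler {
          return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
              if r.Proto == "HTTP/2.0" && rand.Float64() < chance {
                  // For HTTP/2 responses, Go's server turns this header into a
                  // GOAWAY frame plus a graceful connection shutdown; in-flight
                  // requests on the connection are allowed to finish first.
                  w.Header().Set("Connection", "close")
              }
              next.ServeHTTP(w, r)
          })
      }

      func main() {
          api := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
              fmt.Fprintln(w, "ok")
          })

          srv := &http.Server{
              Addr:    ":8443",                            // illustrative listen address
              Handler: withProbabilisticGoaway(api, 0.01), // 0.01 = the 1% chance from the release note
          }
          // TLS is required for HTTP/2 with net/http; the cert/key paths are placeholders.
          _ = srv.ListenAndServeTLS("tls.crt", "tls.key")
      }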

      This is a clone of issue OCPBUGS-60121. The following is the description of the original issue:

      This is a clone of issue OCPBUGS-43521. The following is the description of the original issue:

      When master nodes or kube-apiservers are taken offline (for a MachineConfig update, revision rollout, etc.), a single kube-apiserver ends up holding the majority of the long-lived and live connections.
      After all three masters are back online, a single kube-apiserver continues to receive the majority of live API connections, causing that master node's CPU to hit 100%.

      Restarting the kube-apiserver pod resolves the issue.

      The expectation is that once all three masters are up, the live API connections would be rebalanced across the three master nodes.

      Looking for assistance in determining why the live connections are not evenly distributed between the kube-apiservers once quorum is re-established.
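
      To see why a 1% per-request GOAWAY chance should restore balance once all three masters are back, the small Go simulation below models the skewed state described above. It is an illustration only, not data from the affected cluster, and it assumes the load balancer picks one of the three kube-apiservers uniformly at random whenever a connection is re-established.

      package main

      import (
          "fmt"
          "math/rand"
      )

      func main() {
          const (
              servers      = 3
              connections  = 3000
              goawayChance = 0.01 // 1% per request, as in the release note
              rounds       = 1000 // each round = one request on every connection
          )

          // Start from the failure mode in the report: everything pinned to
          // apiserver 0 after the other masters were offline.
          conns := make([]int, connections)
          counts := make([]int, servers)
          counts[0] = connections

          for round := 1; round <= rounds; round++ {
              for i := range conns {
                  if rand.Float64() < goawayChance {
                      // GOAWAY: the client reconnects and the load balancer
                      // picks a backend at random.
                      counts[conns[i]]--
                      conns[i] = rand.Intn(servers)
                      counts[conns[i]]++
                  }
              }
              if round%200 == 0 {
                  fmt.Printf("after %4d requests per connection: %v\n", round, counts)
              }
          }
      }

      With a 1% chance, a connection is expected to survive roughly 100 requests before being reset, so an initially one-sided distribution decays toward an even split of roughly connections/servers per instance after a few hundred requests per connection.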

              Assignee: Vadim Rutkovsky (vrutkovs@redhat.com)
              Reporter: Daniel Seals (rhn-support-dseals)
              QA Contact: Ke Wang
              Votes: 0
              Watchers: 6
