Type: Bug
Resolution: Obsolete
Priority: Major
Fix Version/s: None
Affects Version/s: 4.6
Component/s: Cloud Compute / OpenStack Provider
Labels:
- Triaged

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:
None
Story Points:
None
Severity:
None
Regression:
None

Target Backport Versions:
None
Target Version:

4.11.z
Release Blocker:
Rejected
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Test Coverage:

-

Release Note Status:
None
Release Note Type:
Bug Fix
Release Note Text:

Previously, the Keepalived health check looked at the status of the load-balanced kube-apiserver. This could be problematic when the cluster was recovering from an outage and the API server was unreliable. Keepalived now checks the readiness of HAProxy and lets HAProxy manage the API server backends. This prevents unnecessary API VIP failovers.

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

This relates to the recovery of a cluster following an etcd outage.

The ingress path to kube-apiserver is:

───────────> VIP ─────────────────> Local HAProxy ────┬─> kube-apiserver-master-0
    (managed by keepalived)                           │
                                                      ├─> kube-apiserver-master-1
                                                      │
                                                      └─> kube-apiserver-master-2

Each master is running an HAProxy which load balances between the 3 kube-apiservers. Each HAProxy is running health checks against each kube-apiserver, and will add or remove it from the available pool based on its health.
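The pool behaviour described above can be sketched as a toy model. This is only an illustration of the add/remove logic, not HAProxy's implementation: the `BackendPool` class and its method names are hypothetical, the round-robin selection is an assumption, and a real HAProxy typically requires several consecutive check results before changing a backend's state.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class BackendPool:
    """Toy model of one HAProxy's view of its kube-apiserver backends."""

    available: Set[str] = field(default_factory=set)

    def record_check(self, backend: str, healthy: bool) -> None:
        # A passing check puts the backend in the pool, a failing one removes it.
        if healthy:
            self.available.add(backend)
        else:
            self.available.discard(backend)

    def pick(self, request_id: int) -> Optional[str]:
        # Round-robin over the available backends; None if the pool is empty.
        if not self.available:
            return None
        ordered = sorted(self.available)
        return ordered[request_id % len(ordered)]


pool = BackendPool()
for name in ("kube-apiserver-master-0",
             "kube-apiserver-master-1",
             "kube-apiserver-master-2"):
    pool.record_check(name, healthy=True)

# master-1 starts failing its health check and drops out of the pool.
pool.record_check("kube-apiserver-master-1", healthy=False)
```

The point for this bug is that the decision about which kube-apiserver receives traffic already lives here, in HAProxy, on every master.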

We only use keepalived to ensure that HAProxy is not a single point of failure. It is the job of keepalived to ensure that incoming traffic is being directed to an HAProxy which is functioning correctly.

The current health check we are using for keepalived polls /readyz through the local HAProxy. While this seems intuitively correct, it is in fact testing the wrong thing. Because HAProxy forwards the request to one of its backends, the check tests whether the kube-apiserver it connects to is functioning correctly. However, this is not the purpose of keepalived. HAProxy already runs health checks against the kube-apiserver backends. keepalived only needs to select a correctly functioning HAProxy.

This becomes important during recovery from an outage. When none of the kube-apiservers are healthy, this health check will fail continuously and the API VIP will move uselessly between masters. However, the situation is much worse when only one of the kube-apiservers is up. In this case there is a high probability that it is overloaded and at least rate limiting incoming connections. This may lead us to fail the keepalived health check and fail the VIP over to the next HAProxy. The failover resets every open kube-apiserver connection, including established ones that were working. The clients then reconnect, which increases the load on the kube-apiserver and increases the probability that the health check will fail again.
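The feedback loop can be illustrated with a deliberately crude, deterministic simulation. Everything in it is an assumption made for illustration: the function name, the load units, and all the numeric defaults are invented, and real overload and health check failures are probabilistic rather than a hard threshold.

```python
def simulate_recovery(rounds, check_backend, initial_load=130, steady_load=80,
                      capacity=100, decay=10, reconnect_cost=20):
    """Return (failovers, final_load) after `rounds` health check intervals.

    check_backend=True models a keepalived check that fails whenever the
    single surviving kube-apiserver is overloaded. check_backend=False
    models a check that only looks at HAProxy, which stays up throughout.
    """
    load = initial_load
    failovers = 0
    for _ in range(rounds):
        overloaded = load > capacity
        if check_backend and overloaded:
            # Check fails, the VIP moves, every connection is reset and
            # the reconnecting clients add to the load.
            failovers += 1
            load += reconnect_cost
        else:
            # No failover: the post-outage spike drains towards steady state.
            load = max(steady_load, load - decay)
    return failovers, load


with_backend_check = simulate_recovery(10, check_backend=True)
haproxy_only_check = simulate_recovery(10, check_backend=False)
```

Under these made-up numbers the backend-sensitive check never lets the load drop below capacity, while the HAProxy-only check lets the spike drain without a single failover.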

Ideally the keepalived health check would check only the health of HAProxy itself, not the health of the pool of kube-apiservers. In practice it will probably never be necessary to move the VIP while the master is up, regardless of the health of the cluster. A network partition affecting HAProxy would already be handled by VRRP between the masters, so it may be sufficient to check that the local HAProxy pod is healthy.
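The difference between the two checks can be expressed as a pair of predicates over a simplified state. This is a sketch of the reasoning only, not the actual check scripts: `MasterState` and both function names are hypothetical, and the model assumes a request through HAProxy succeeds whenever at least one backend is ready, which ignores rate limiting.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MasterState:
    haproxy_up: bool            # is the local HAProxy itself serving?
    backends_ready: List[bool]  # readiness of each kube-apiserver backend


def check_via_readyz(state: MasterState) -> bool:
    # Current behaviour: /readyz through HAProxy passes only if HAProxy is
    # up AND it can reach a ready kube-apiserver.
    return state.haproxy_up and any(state.backends_ready)


def check_haproxy_only(state: MasterState) -> bool:
    # Behaviour argued for here: only the local HAProxy's health matters.
    return state.haproxy_up


# Recovering cluster: HAProxy is fine, but no kube-apiserver is ready yet.
recovering = MasterState(haproxy_up=True, backends_ready=[False, False, False])
# Genuine local failure: HAProxy itself is down on this master.
haproxy_down = MasterState(haproxy_up=False, backends_ready=[True, True, True])
```

For `recovering`, the first check fails on every master, so the VIP moves without any benefit, while the second keeps the VIP in place. For `haproxy_down`, both checks fail, which is the only case where moving the VIP helps.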

clones

OCPBUGS-1257 Keepalived health check causes unnecessary VIP flapping when HAProxy is healthy

Closed

depends on

OCPBUGS-1257 Keepalived health check causes unnecessary VIP flapping when HAProxy is healthy

Closed

Assignee: Martin André

Reporter: Matthew Booth

Need Info From: None

Contributors: None

QA Contact: Jon Uriarte

Doc Contact: None

Involved: Benjamin Nemec, Martin André

Votes: 0

Watchers: 5

Created: 2022/12/07 2:53 PM

Updated: 2025/07/28 5:31 PM

Resolved: 2023/06/15 2:34 AM
