Type: Epic
Resolution: Unresolved
Priority: Normal
Fix Version/s: None
Affects Version/s: None
Labels:

Epic Name: Enforce PDB unhealthyEvictionPolicy in OpenShift
Blocked: False
Blocked Reason: None
Ready: False
Color Status: Not Selected
Epic Status: To Do

Owners of all PodDisruptionBudgets (PDBs) in OpenShift should consider setting .spec.unhealthyEvictionPolicy to AlwaysAllow.

AlwaysAllow permits eviction of unhealthy (not ready) pods even if the PodDisruptionBudget currently allows no disruptions. This can help to drain or maintain a node and to recover without manual intervention when multiple nodes or pods are misbehaving. Use this with caution, as this option can disrupt pods that have not yet had a chance to become healthy.
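The difference between the two policies can be sketched as a small decision function. This is a simplified model of the behaviour documented upstream for unhealthyEvictionPolicy, not the eviction API's actual code; the names `BudgetStatus` and `may_evict` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class BudgetStatus:
    """Subset of the PDB status fields that matter for this sketch."""
    current_healthy: int      # pods currently counted as healthy (Ready)
    desired_healthy: int      # minimum healthy pods the budget requires
    disruptions_allowed: int  # voluntary disruptions currently permitted


def may_evict(pod_ready: bool, policy: str, status: BudgetStatus) -> bool:
    """Return True if an eviction request for a running pod would be allowed."""
    if pod_ready:
        # Healthy pods are always subject to the budget, under either policy.
        return status.disruptions_allowed > 0
    if policy == "AlwaysAllow":
        # Unhealthy pods can be evicted regardless of the budget.
        return True
    if policy == "IfHealthyBudget":
        # Unhealthy pods can be evicted only while the application as a whole
        # still meets its desired healthy count.
        return status.current_healthy >= status.desired_healthy
    raise ValueError(f"unknown unhealthyEvictionPolicy: {policy!r}")


# A degraded workload: 1 of 2 required pods is healthy, no disruptions allowed.
degraded = BudgetStatus(current_healthy=1, desired_healthy=2, disruptions_allowed=0)
```

With `degraded`, an unhealthy pod is evictable under AlwaysAllow but not under IfHealthyBudget, which is the situation in which a node drain can block.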

Example PRs:

The IfHealthyBudget policy (the default when the field is unset) should be used only in special cases where operand availability is critical, for example etcd: https://github.com/openshift/cluster-etcd-operator/pull/1171/commits/647af2f5002a4f6c5846e885eb2643916394a21e

EDIT: etcd in OCP should be fine with AlwaysAllow, but it is possibly problematic in HyperShift. See https://redhat-internal.slack.com/archives/CKJR6200N/p1724775333783439

The IfHealthyBudget policy achieves the least amount of disruption, as it does not allow eviction when multiple etcd pods do not report readiness. The trade-off is that it can block node drain and maintenance. The cluster administrator should then analyze these pods and decide which one to bring down manually.

This should be communicated to all PDB owners.
Use of the AlwaysAllow policy should be enforced by a test. There should be only a handful of exceptions (e.g. etcd).
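One possible shape for such a test is sketched below. It assumes the PDB manifests have already been loaded as plain dictionaries and that exceptions are tracked as (namespace, name) pairs. The function name `find_violations`, the sample manifests, and the exception entry are all hypothetical and do not reflect any existing OpenShift test.

```python
# Policy applied when .spec.unhealthyEvictionPolicy is unset.
DEFAULT_POLICY = "IfHealthyBudget"


def find_violations(pdbs, exceptions):
    """Return sorted (namespace, name) pairs of PDBs that do not use AlwaysAllow.

    pdbs:       iterable of PodDisruptionBudget manifests as dicts
    exceptions: set of (namespace, name) pairs allowed to keep another policy
    """
    violations = []
    for pdb in pdbs:
        meta = pdb.get("metadata", {})
        key = (meta.get("namespace", ""), meta.get("name", ""))
        policy = pdb.get("spec", {}).get("unhealthyEvictionPolicy", DEFAULT_POLICY)
        if policy != "AlwaysAllow" and key not in exceptions:
            violations.append(key)
    return sorted(violations)


# Hypothetical manifests, for illustration only.
sample_pdbs = [
    {"metadata": {"namespace": "ns-a", "name": "good-pdb"},
     "spec": {"unhealthyEvictionPolicy": "AlwaysAllow"}},
    {"metadata": {"namespace": "ns-b", "name": "unset-pdb"},
     "spec": {"minAvailable": 1}},
    {"metadata": {"namespace": "openshift-etcd", "name": "example-etcd-pdb"},
     "spec": {"unhealthyEvictionPolicy": "IfHealthyBudget"}},
]

# Hypothetical exception list; real entries would be agreed with PDB owners.
sample_exceptions = {("openshift-etcd", "example-etcd-pdb")}

violations = find_violations(sample_pdbs, sample_exceptions)
```

Treating an unset field as a violation is a design choice in this sketch: because the default is IfHealthyBudget, a PDB that omits the field would otherwise pass silently.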
