Red Hat Workload Availability / RHWA-283

[FAR] Improve HA by enforcing podAntiAffinity for controller-manager replicas

      Cause: The default scheduler may place two FAR replicas on the same node, especially when fewer nodes are available.
      Consequence: The fence-agents-controller-manager replicas may be scheduled on the same node, which reduces the effectiveness of high availability and creates a single point of failure.
      Fix: The FAR deployment includes a podAntiAffinity rule using preferredDuringSchedulingIgnoredDuringExecution with topologyKey: kubernetes.io/hostname.
      Result: The default scheduler prefers placing each FAR replica on a node that does not already run one.
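A soft anti-affinity rule of the kind described above can be sketched as the following pod-template fragment. This is a minimal illustration, not the actual FAR manifest; in particular, the matchLabels value is an assumed label, and the weight is an arbitrary example:

```yaml
# Soft anti-affinity: the scheduler prefers to place replicas on
# separate nodes, but can still co-locate them when only one node
# is schedulable (e.g. during maintenance).
# The matchLabels value below is an illustrative assumption.
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                app.kubernetes.io/name: fence-agents-controller-manager
            # Treat each node (hostname) as a separate topology domain.
            topologyKey: kubernetes.io/hostname
```

Because the rule is "preferred" rather than "required", a single-node cluster can still run both replicas; the scheduler only penalizes co-location when alternatives exist.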
    • Feature
    • Proposed

      The fence-agents-controller-manager replicas may be scheduled on the same node, which reduces the effectiveness of high availability.

      This issue can occur when nodes are being brought back online one by one, such as during maintenance.
      In such cases, both replicas may be placed on a single node, creating a single point of failure.

      This patch introduces the following change:

      [Before merging this patch]
      The default scheduler may place two replicas on the same node, especially when fewer nodes are available.

      [After merging this patch]
      The fence-agents-controller-manager deployment includes a podAntiAffinity rule using requiredDuringSchedulingIgnoredDuringExecution, ensuring that replicas are scheduled on separate nodes.
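A hard rule of this kind can be sketched as the following pod-template fragment. This is an illustrative sketch under assumptions, not the actual deployment manifest; the matchLabels value in particular is assumed:

```yaml
# Hard anti-affinity: a replica is only scheduled onto a node that
# does not already run a pod matching the selector. With two replicas,
# this guarantees they land on two distinct nodes.
# The matchLabels value below is an illustrative assumption.
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app.kubernetes.io/name: fence-agents-controller-manager
          # One replica per hostname (i.e. per node).
          topologyKey: kubernetes.io/hostname
```

Note the trade-off: with a "required" rule, if only one node is schedulable, the second replica stays Pending until another node becomes available.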

      Scheduling both replicas on the same node introduces a single point of failure and should be avoided in HA configurations.
      When only one node is available, such as during planned maintenance or recovery, this can lead to delayed or missed remediation.
      Using podAntiAffinity rules so that each replica runs on a different node improves fault tolerance and makes the remediation process more resilient.

              oraz@redhat.com Or Raz
              kkawakam@redhat.com KATSUYA KAWAKAMI