Type: Story
Resolution: Done
Priority: Major
Fix Version/s: rhwa-4.21-0
Affects Version/s: None
Component/s: Fence Agents Remediation, Node Healthcheck, Self Node Remediation
Labels:
- docs-RN-needed

Blocked:
False
Blocked Reason:
None
Ready:
False
Release Note Text:

Cause: The RHWA operators used to have inconsistent deployment configurations with regard to replicas, node affinity and update strategy.
Consequence: Potentially slower remediation if the operator pod was running on an unhealthy node.
Fix: Use 2 replicas for NHC, FAR and SNR, use topologySpreadConstraints to prevent the replicas from running on the same node, and use updateStrategy to avoid potential update locks in some edge cases.
Result: Reduced chance of slower remediation.

Rewrite
======
Cause: The RHWA Operators had inconsistent deployment configurations with regard to replicas, node affinity, and update strategy.
Consequence: If the Operator pod was running on an unhealthy node, remediation could be slower.
Fix: With this release, Node Health Check (NHC), Fence Agents Remediation (FAR), and Self Node Remediation (SNR) each use two replicas. The 'topologySpreadConstraints' parameter prevents the replicas from running on the same node, and the 'updateStrategy' parameter avoids potential update locks in some edge cases.
Result: The chance of slower remediation is reduced.

Release Note Type:
Enhancement
Release Note Status:
Proposed
Intelligence Requested:
Market:

Target Version:
rhwa-4.21-0

SFDC Cases Links:
SFDC Cases Open:
SFDC Cases Counter:

We configured NHC, FAR and SNR to use topologySpreadConstraints to spread replicas across nodes. This might cause problems with updates in some corner cases; see the comment on the SNR PR: https://github.com/medik8s/self-node-remediation/pull/180#discussion_r2419792014

Investigate if this is a real issue, and update all 3 operators if needed.
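
For orientation, the combination discussed here (two replicas, a spread constraint, and an explicit rollout strategy) could look roughly like the sketch below. It builds a Deployment manifest as a plain Python dict and prints it as JSON. This is only an illustration, not the configuration from the linked PRs: the label key and value, the container name, the image, and the `maxSurge`/`maxUnavailable` values are assumptions, and `build_operator_deployment` is a hypothetical helper. Note that in the Kubernetes Deployment API the rollout settings live under `spec.strategy`.

```python
import json


def build_operator_deployment(name: str, app_label: str) -> dict:
    """Return a Deployment manifest (as a dict) with two spread-out replicas."""
    # Hypothetical label; the real operators may use different labels.
    labels = {"app.kubernetes.io/name": app_label}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": 2,
            "selector": {"matchLabels": labels},
            # Assumed values: replace one pod at a time without creating a
            # surge pod. The actual values chosen in the PRs may differ.
            "strategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"maxSurge": 0, "maxUnavailable": 1},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Keep the two replicas on different nodes.
                    "topologySpreadConstraints": [
                        {
                            "maxSkew": 1,
                            "topologyKey": "kubernetes.io/hostname",
                            "whenUnsatisfiable": "DoNotSchedule",
                            "labelSelector": {"matchLabels": labels},
                        }
                    ],
                    # Placeholder container; name and image are made up.
                    "containers": [
                        {"name": "manager", "image": "example.invalid/operator:latest"}
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = build_operator_deployment(
        "node-healthcheck-controller-manager", "node-healthcheck-operator"
    )
    print(json.dumps(manifest, indent=2))
```

Whether a surge pod can be scheduled during a rollout depends on how many eligible nodes exist and how the spread constraint counts old and new pods, so any concrete values would need to be checked against the corner case described in the PR comment.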

rhwa 366.html
2 kB
2026/02/18 10:11 PM

is caused by

RHWA-363 SNR: use 2 replicas

Closed

RHWA-364 [FAR] Improve HA by using 'topologySpreadConstraints' to enforce strict pod distribution for fence-agents-controller-manager replicas

Closed

RHWA-365 [NHC] Improve HA by using 'topologySpreadConstraints' to enforce strict pod distribution for node-healthcheck-controller-manager replicas

Closed

links to

medik8s/fence-agents-remediation#187: Change Deployment UpdateStrategy

medik8s/node-healthcheck-operator#384: Change Deployment UpdateStrategy

medik8s/self-node-remediation#273: Change Deployment UpdateStrategy

mentioned on

Merge request - TELCODOCS-2597: RHWA 4.21-0 Release Notes first draft / Common Attributes...


Assignee: Marc Sluiter

Reporter: Marc Sluiter

Votes: 0

Watchers: 7

Created: 2025/10/16 4:38 PM

Updated: 2026/02/25 10:21 AM

Resolved: 2026/01/12 4:06 PM
