Type: Bug
Resolution: Unresolved
Priority: Undefined
Fix Version/s: rhel-10.2
Affects Version/s: rhel-9.6
Component/s: resource-agents
Labels:
None

Fixed in Build:
resource-agents-4.16.0-44.el10
Regression:
None
Severity:
Moderate
AssignedTeam:
rhel-ha
Keywords:

OtherQA, ZStream

Dev Target Milestone:
13
Internal Target Milestone:
26
Story Points:
3
Blocked:
False
Ready:
False
Blocked Reason:
None
Product Documentation Required:
No
Sprint:
None
Release Blocker:
Regression Exception
Target Backport Versions:

rhel-9.6.z, rhel-9.7.z, rhel-9.8

Git Pull Request:
https://github.com/ClusterLabs/resource-agents/pull/2089, https://github.com/ClusterLabs/resource-agents/pull/2096
Preliminary Testing:
Pass
Test Coverage:

Manual

Release Note Type:
Unspecified Release Note Type - Unknown
ProdDocsReview-CCS:
Unspecified
ProdDocsReview-Dev:
Unspecified
ProdDocsReview-QE:
Unspecified

Experience:

PX Impact Score:
SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Planning:
None

Currently, if the etcd container managed by podman-etcd is abruptly terminated, the monitor operation returns OCF_NOT_RUNNING. Pacemaker interprets this as the resource never having been running, which triggers an immediate, local restart of the agent.

This restart is too quick and is not coordinated with the peer node. The agent attempts to rejoin a cluster that has not yet recognized the failure, leading to inconsistent state detection (e.g., seeing 2 active nodes) and causing the start operation to fail or deadlock.
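
The behavior described above can be illustrated with a small toy model. This is only a sketch, not the podman-etcd agent code. The numeric values are the standard OCF exit codes. The functions `monitor` and `local_reaction` are hypothetical names introduced here, and the reaction strings simply restate what this issue reports.

```python
from enum import IntEnum


class OCF(IntEnum):
    """Standard OCF resource agent exit codes used in this sketch."""
    SUCCESS = 0
    ERR_GENERIC = 1
    NOT_RUNNING = 7


def monitor(container_running: bool) -> OCF:
    """Hypothetical stand-in for the monitor operation as described in this issue.

    An abruptly terminated container is reported the same way as one that
    was never started: OCF_NOT_RUNNING.
    """
    return OCF.SUCCESS if container_running else OCF.NOT_RUNNING


def local_reaction(rc: OCF) -> str:
    """Hypothetical summary of the reaction reported in this issue."""
    if rc is OCF.SUCCESS:
        return "no action"
    if rc is OCF.NOT_RUNNING:
        # Reported problem: the restart happens before the peer node
        # has recognized the failure.
        return "immediate local restart"
    return "other failure handling"


if __name__ == "__main__":
    rc = monitor(container_running=False)
    print(rc.name, "->", local_reaction(rc))
```

In this model the monitor result carries no information about why the container is absent, so an abrupt termination cannot be told apart from a resource that was never started.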

Assignee: Oyvind Albrigtsen

Reporter: Carlo Lobrano

Developer: Oyvind Albrigtsen

QA Contact: Luca Consalvi

Votes: 0

Watchers: 8

Created: 2025/11/04 10:49 AM

Updated: 2026/02/16 4:22 PM

Stale Date: 2027/02/15

Dev Target end: 2025/11/24

Target end: 2026/02/23
