Type: Bug
Resolution: Unresolved
Priority: Normal
Fix Version/s: None
Affects Version/s: rhel-10.0
Component/s: pacemaker
Labels: auto-close-warning
Regression: No
Severity: None
AssignedTeam: rhel-ha
Story Points: 8
Blocked: False
Ready: False
Blocked Reason: None
Product Documentation Required: None
Sprint: None
Preliminary Testing: None
Test Coverage: None
Experience:
Architecture: x86_64
PX Impact Score:
SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:
Planning: None

Please provide the package NVR for which the bug is seen:

pacemaker-3.0.0-5.el10.x86_64
pcs-0.12.0-2.el10.x86_64

How reproducible is this bug?:

always, easily

Steps to reproduce

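Create a Dummy resource with op_sleep=15, which is longer than the monitor interval (10s) and equal to the monitor timeout (15s), so the monitor operation times out and the resource fails: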

[root@virt-246 ~]# pcs resource create dummy1 ocf:pacemaker:Dummy op_sleep=15 op monitor interval=10 timeout=15

Create a location constraint on that resource:

[root@virt-246 ~]# pcs constraint location dummy1 prefers virt-245=INFINITY

Wait until the resource fails, then disable it:

[root@virt-246 ~]# pcs resource disable dummy1

[root@virt-246 ~]# pcs status --full
Cluster name: STSRHTS2031
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: virt-245 (1) (version 3.0.0-5.el10-5b53b7e) - partition with quorum
  * Last updated: Fri Feb  7 15:54:28 2025 on virt-246
  * Last change:  Fri Feb  7 15:54:09 2025 by root via root on virt-246
  * 2 nodes configured
  * 3 resource instances configured (1 DISABLED)

Node List:
  * Node virt-245 (1): online, feature set 3.20.0
  * Node virt-246 (2): online, feature set 3.20.0

Full List of Resources:
  * fence-virt-245	(stonith:fence_xvm):	 Started virt-245
  * fence-virt-246	(stonith:fence_xvm):	 Started virt-246
  * dummy1	(ocf:pacemaker:Dummy):	 FAILED virt-245 (disabled)

Migration Summary:
  * Node: virt-245 (1):
    * dummy1: migration-threshold=1000000 fail-count=2 last-failure='Fri Feb  7 15:54:14 2025'

Failed Resource Actions:
  * dummy1_monitor_10000 on virt-245 'Error occurred' (1): call=21, status='Timed out', exitreason='Resource agent did not complete within 15s', last-rc-change='Fri Feb  7 15:54:14 2025', queued=0ms, exec=14847ms

Tickets:

PCSD Status:
  virt-245: Online
  virt-246: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Delete the resource:

[root@virt-246 ~]# pcs resource delete dummy1
Removing dependant element:
  Location constraint: 'location-dummy1-virt-245-INFINITY'
Stopping resource 'dummy1' before deleting
Waiting for the cluster to apply configuration changes...

While the delete is waiting, check for pending actions from the other node:

[root@virt-245 ~]# crm_resource --wait -T 1
Pending actions:
crm_resource: Error performing operation: Timeout occurred
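
The pending-actions check above can be wrapped in a small helper for scripted reproduction. This is only a sketch: the function name `cluster_settled` and the `CRM_RESOURCE` override variable are hypothetical, and it relies solely on `crm_resource --wait -T <timeout>` exiting non-zero when the wait does not complete, as the error in the transcript suggests.

```shell
# Hypothetical helper, not part of pacemaker or pcs.
# CRM_RESOURCE can be overridden (e.g. with a stub) for dry runs.
CRM_RESOURCE="${CRM_RESOURCE:-crm_resource}"

# Succeeds if the cluster reports no pending actions before the timeout,
# fails otherwise. The timeout value is passed to -T unchanged.
cluster_settled() {
    "$CRM_RESOURCE" --wait -T "${1:-1}" >/dev/null 2>&1
}
```

Example use: `cluster_settled 1 || echo "cluster still has pending actions"`.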

Expected results

The resource is deleted.

Actual results

The resource is not deleted; the cluster gets stuck while deleting it.

Additional info:

If I run `pcs resource refresh` on the disabled resource before deleting it, the resource is deleted after a few seconds and the cluster does not get stuck.
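
The observation above can be sketched as a small wrapper that refreshes the resource before deleting it. This is a possible interim workaround based only on the behavior reported here, not a fix; the function name `refresh_then_delete` and the `PCS` override variable are hypothetical.

```shell
# Hypothetical wrapper, not part of pcs.
# PCS can be overridden (e.g. with a stub) for dry runs.
PCS="${PCS:-pcs}"

# Refresh the given resource, then delete it.
# Stops without deleting if the refresh fails.
refresh_then_delete() {
    resource_id="$1"
    [ -n "$resource_id" ] || {
        echo "usage: refresh_then_delete <resource id>" >&2
        return 2
    }
    "$PCS" resource refresh "$resource_id" || return 1
    "$PCS" resource delete "$resource_id"
}
```

Example use: `refresh_then_delete dummy1`.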

Attachments:

cib.xml (12 kB, 2025/02/11 8:05 PM)
crm_resource.log (190 kB, 2025/02/11 8:05 PM)
pacemaker.log (114 kB, 2025/02/11 8:05 PM)

is cloned by

RHEL-153661 [RHEL9] Cluster gets stuck, when deleting failed and disabled resource with a constraint.

links to

ClusterLabs T983

Assignee: Christopher Lumens

Reporter: Marketa Smazova

Developer: Christopher Lumens

QA Contact: Cluster QE

Votes: 0

Watchers: 8

Created: 2025/02/07 3:22 PM

Updated: 2026/03/05 1:35 PM

Stale Date: 2026/04/03
