Bug
Resolution: Unresolved
Normal
rhel-9.2.0
Low
rhel-sst-high-availability
ssg_filesystems_storage_and_HA
What were you trying to do that didn't work?
I was periodically running crm_resource --refresh --resource <resource_name> and noticed that, after one of the refreshes, the resource failed over to another host in the cluster. The Pacemaker logs from that time show the following:
Aug 29 13:42:35.400 R9GHADR-srv-2 pacemaker-execd [2447551] (cancel_recurring_action) info: Cancelling ocf operation db2_gerry_gerry_TESTDB_monitor_9000
Aug 29 13:42:35.400 R9GHADR-srv-2 pacemaker-execd [2447551] (services_action_cancel) info: Terminating in-flight op db2_gerry_gerry_TESTDB_monitor_9000[713660] early because it was cancelled
Aug 29 13:42:35.401 R9GHADR-srv-2 pacemaker-execd [2447551] (async_action_complete) info: db2_gerry_gerry_TESTDB_monitor_9000[713660] terminated with signal 9 (Killed)
Aug 29 13:42:35.401 R9GHADR-srv-2 pacemaker-execd [2447551] (cancel_recurring_action) info: Cancelling ocf operation db2_gerry_gerry_TESTDB_monitor_9000
<...>
Aug 29 13:42:35.403 R9GHADR-srv-2 pacemaker-attrd [2447552] (update_attr_on_host) notice: Setting last-failure-db2_gerry_gerry_TESTDB#monitor_9000[R9GHADR-srv-2] in instance_attributes: (unset) -> 1724964155 | from R9GHADR-srv-2 with no write delay
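To check whether other refreshes hit the same sequence, the log can be scanned for operations that were cancelled while still running and then killed. The following is only a sketch: the function name cancelled_monitors is hypothetical, and it assumes the log lines keep the whitespace-separated layout shown above.

```shell
#!/bin/sh
# cancelled_monitors is a hypothetical helper, not part of Pacemaker.
# It prints the timestamp and operation of every action that was first
# reported as "Terminating in-flight op ... because it was cancelled"
# and later as "terminated with signal 9".
cancelled_monitors() {
    awk '
        /Terminating in-flight op .* early because it was cancelled/ {
            # The token after "op" is the operation key plus its PID,
            # e.g. name_monitor_9000[713660].
            for (i = 1; i < NF; i++)
                if ($i == "op") cancelled[$(i + 1)] = 1
        }
        /terminated with signal 9/ {
            # Report only kills that match a previously cancelled op.
            for (i = 1; i <= NF; i++)
                if (($i) in cancelled) print $1, $2, $3, $i
        }
    ' "$@"
}
```

It would be run against the Pacemaker log on the node that hosted the resource, for example cancelled_monitors /var/log/pacemaker/pacemaker.log (the path is an assumption; adjust it to the local logging configuration).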
Please provide the package NVR for which bug is seen:
How reproducible: Very easy.
Steps to reproduce
- Make the monitor take a long time, e.g. by adding sleep 10 to it.
- Note that in my case the migration-threshold for the resource is set to 1, which makes the failure very obvious because a single failure results in a takeover.
- Continuously issue crm_resource --refresh --resource <resource_name> until the failure is observed.
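The last step can be automated with a small loop. This is a rough sketch under several assumptions: refresh_until_failure and count_failures are hypothetical helper names, the log path is a guess at the default location, the 5 second pause is arbitrary, and a failure is detected by counting the "Setting last-failure-..." lines shown in the log excerpt above, so the script has to run on the node that hosts the resource.

```shell
#!/bin/sh
# LOG is an assumption about where Pacemaker writes its detail log.
LOG="${LOG:-/var/log/pacemaker/pacemaker.log}"
# CRM_RESOURCE can be pointed at a stub for a dry run.
CRM_RESOURCE="${CRM_RESOURCE:-crm_resource}"

# Count the "Setting last-failure-<resource>#monitor..." lines in the log.
count_failures() {
    n=$(grep -cF "Setting last-failure-$1#monitor" "$LOG" 2>/dev/null)
    echo "${n:-0}"
}

# Refresh the resource up to max times (default 100) and stop as soon as
# a new monitor failure shows up in the log.
refresh_until_failure() {
    rsc=$1
    max=${2:-100}
    before=$(count_failures "$rsc")
    i=1
    while [ "$i" -le "$max" ]; do
        "$CRM_RESOURCE" --refresh --resource "$rsc" || return 2
        sleep "${PAUSE:-5}"   # arbitrary pause between refreshes
        now=$(count_failures "$rsc")
        if [ "$now" -gt "$before" ]; then
            echo "monitor failure recorded after $i refresh(es)"
            return 0
        fi
        i=$((i + 1))
    done
    echo "no monitor failure after $max refreshes"
    return 1
}
```

For example, refresh_until_failure <resource_name> 200 would issue at most 200 refreshes and report how many were needed before the monitor failure appeared.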
Expected results
Pacemaker should rerun the cancelled monitor without failing over the resource to another host.
Actual results
Pacemaker kills the running monitor and counts the kill as a monitor failure, which is problematic when migration-threshold is set to 1 because that single failure triggers a failover.