RHEL-105120

Pacemaker not monitoring a resource for over a minute after restarting it


    • Bug
    • Resolution: Not a Bug
    • Normal
    • rhel-9.5
    • pacemaker
    • rhel-ha
    • x86_64

      What were you trying to do that didn't work?

      Soft cascading failure testing (killing multiple resources at the same time using kill -9).

      One particular resource was restarted successfully:

      Jul 07 14:21:22.250 svt-ps-efa-1 pacemaker-controld  [3513] (execute_rsc_action)        notice: Initiating start operation db2_member_db2inst1_1_start_0 on svt-ps-efa-4 | action 44

      ...

      Jul 07 14:21:22.476 svt-ps-efa-1 pacemaker-controld  [3513] (process_graph_event)       info: Transition 5405 action 44 (db2_member_db2inst1_1_start_0 on svt-ps-efa-4) confirmed: ok | rc=0

       

      However, the monitor for this resource is not initiated until about 64 seconds later:

      Jul 07 14:22:26.615 svt-ps-efa-1 pacemaker-controld  [3513] (execute_rsc_action)        notice: Initiating monitor operation db2_member_db2inst1_1_monitor_10000 on svt-ps-efa-4 | action 47
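
To make the gap concrete, the delay can be computed from the timestamps in the log lines quoted above. This is only a minimal sketch: the helper names (`parse_ts`, `first_matching`) are made up for illustration, the placeholder year is an assumption (the log lines carry no year), and `%b` assumes an English/C locale.

```python
import re
from datetime import datetime

# The three pacemaker-controld lines quoted in this report, verbatim.
LOG_LINES = [
    "Jul 07 14:21:22.250 svt-ps-efa-1 pacemaker-controld  [3513] (execute_rsc_action)        notice: Initiating start operation db2_member_db2inst1_1_start_0 on svt-ps-efa-4 | action 44",
    "Jul 07 14:21:22.476 svt-ps-efa-1 pacemaker-controld  [3513] (process_graph_event)       info: Transition 5405 action 44 (db2_member_db2inst1_1_start_0 on svt-ps-efa-4) confirmed: ok | rc=0",
    "Jul 07 14:22:26.615 svt-ps-efa-1 pacemaker-controld  [3513] (execute_rsc_action)        notice: Initiating monitor operation db2_member_db2inst1_1_monitor_10000 on svt-ps-efa-4 | action 47",
]

# Leading "Mon DD HH:MM:SS.mmm" timestamp of each line.
TS_RE = re.compile(r"^(\w{3} \d{2} \d{2}:\d{2}:\d{2}\.\d{3}) ")


def parse_ts(line, year=2000):
    """Parse the leading timestamp. The year is a placeholder (assumption):
    it only matters for the absolute date, not for same-day differences."""
    match = TS_RE.match(line)
    if match is None:
        raise ValueError(f"no timestamp found in: {line!r}")
    return datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S.%f")


def first_matching(lines, *needles):
    """Return the first line that contains every given substring."""
    for line in lines:
        if all(needle in line for needle in needles):
            return line
    raise LookupError(f"no line contains all of {needles!r}")


start_confirmed = first_matching(LOG_LINES, "db2_member_db2inst1_1_start_0", "confirmed")
monitor_initiated = first_matching(LOG_LINES, "Initiating monitor operation", "db2_member_db2inst1_1")

delay_seconds = (parse_ts(monitor_initiated) - parse_ts(start_confirmed)).total_seconds()
print(f"start confirmed -> monitor initiated: {delay_seconds:.3f} s")
```

The same two helpers could be pointed at a full log file instead of the three quoted lines, provided the file uses the same timestamp layout.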

      What is the impact of this issue to you?

      The resource does not show the correct state, since the monitor is not run during that interval.

      Please provide the package NVR for which the bug is seen: 2.1.9-1

      How reproducible is this bug?: Often

      Steps to reproduce

      1. Perform soft cascading failure testing by killing the member (db2_member_db2inst1_1) and the primary CF (db2_cf_db2inst1_129), i.e. kill both resources at the same time.
      2. Monitor the member resource db2_member_db2inst1_1.
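
Step 1 could be scripted roughly as follows. This is a hedged sketch, not the procedure actually used in the test: `kill_together` is a hypothetical helper, and the report does not say how the PIDs behind db2_member_db2inst1_1 and db2_cf_db2inst1_129 are located, so here they are simply passed on the command line.

```python
import os
import signal
import sys


def kill_together(pids):
    """Send SIGKILL (the equivalent of `kill -9`) to each PID back to back.

    Returns a dict mapping any PID that could not be signalled to the
    OSError raised for it (for example, the process no longer exists).
    """
    failed = {}
    for pid in pids:
        try:
            os.kill(pid, signal.SIGKILL)
        except OSError as exc:
            failed[pid] = exc
    return failed


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumption: the caller already looked up the PIDs of the member and
    # primary CF processes and passes them as arguments.
    errors = kill_together(int(arg) for arg in sys.argv[1:])
    for pid, exc in errors.items():
        print(f"could not kill {pid}: {exc}", file=sys.stderr)
```

Sending both signals from one process keeps the two kills within a few microseconds of each other, which is as close to "at the same time" as a user-space script can reasonably get.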

      Expected results

      The monitor is run right after the member resource is restarted.

      Actual results

      The monitor is run about 1 minute after the resource is restarted.

              rhn-support-clumens Christopher Lumens
              tonysmart100 Tony Wang (Inactive)
              Christopher Lumens Christopher Lumens
              Cluster QE Cluster QE