RHEL-29861

The "pcmk_monitor_timeout" default value is listed as 60s in multiple documents, but should be 20s

    • Bug
    • Resolution: Unresolved
    • Major
    • rhel-10.1
    • rhel-9.4
    • pacemaker
    • None
    • None
    • None
    • rhel-sst-high-availability
    • ssg_filesystems_storage_and_HA
    • 3
    • Dev ack
    • False
    • Yes
    • None
    • None
    • None
    • Bug Fix
    • Proposed
    • All
    • None

      We have identified that the default value listed for the `pcmk_monitor_timeout` stonith device option is inaccurate throughout our documentation and man pages. The default is listed as 60s (based on `stonith-timeout`), but `pcmk_monitor_timeout` is not actually applied unless it is explicitly set, so that value is not accurate. The actual monitor timeout by default is 20s, so we should update this in the documentation and man pages (upstream and in RHEL):

      https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/high_availability_add-on_reference/s1-fencedevicesadditional-haar
      https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configu[...]-fencing-configuring-and-managing-high-availability-clusters
      https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/configu[...]-fencing-configuring-and-managing-high-availability-clusters

       

      $ man pacemaker-fenced
      --------------------------------8<----------------------------- 
      pcmk_monitor_timeout = time [60s]
          Advanced use only: Specify an alternate timeout to use for monitor
          actions instead of stonith-timeout
          Some devices need much more/less time to complete than normal.
          Use this to specify an alternate, device-specific, timeout 
          for 'monitor' actions.
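
      The behavior reported in this issue can be illustrated with a toy model. This is only a sketch of what the description above claims, not Pacemaker's actual code: the function name and both constants are hypothetical, and the 20s figure is taken from this report rather than verified against the source.

```python
from typing import Optional

# Value the documentation and man page currently list as the default.
DOCUMENTED_DEFAULT_S = 60
# Effective default according to this report (an assumption, not verified here).
REPORTED_EFFECTIVE_DEFAULT_S = 20


def effective_monitor_timeout(pcmk_monitor_timeout: Optional[int] = None) -> int:
    """Hypothetical model: the device option only applies when explicitly set."""
    if pcmk_monitor_timeout is not None:
        # An explicitly configured pcmk_monitor_timeout is used as-is.
        return pcmk_monitor_timeout
    # Otherwise the monitor runs with the reported 20s default,
    # not the documented 60s derived from stonith-timeout.
    return REPORTED_EFFECTIVE_DEFAULT_S


print(effective_monitor_timeout())    # 20
print(effective_monitor_timeout(90))  # 90
```

      In this model an unset option yields 20 rather than the documented 60, which is the discrepancy this bug asks to have corrected.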

      Discussion in Slack around issue:

      https://redhat-internal.slack.com/archives/C04HH4AJYH4/p1710789736264799

       

      After discussion with Kenneth Gaillot and engineering, these are the tasks we want to complete under this bug:

      (1) figure out how the fencing monitor timeouts currently work

       

      (2) decide and implement how they should be defined and used

       

      (3) update the upstream documentation accordingly. The same defaults also appear in the pacemaker-fenced man page, which needs updating as well.

       

      (4) update the RHEL documentation. 

      • For the official documentation updates, the following DOC request has been opened:

          [RHELDOCS-17816] Update documentation for pcmk_monitor_timeout
          https://issues.redhat.com/browse/RHELDOCS-17816

      This issue is also an extension of the issues being reviewed in the following bug:

          RHEL-14826 A stop action for a stonith device timed out leading to a cluster node
          being fenced
          https://issues.redhat.com/browse/RHEL-14826

       

              rhn-support-nwahl Reid Wahl
              rhn-support-jobaker Joshua Baker
              Kenneth Gaillot Kenneth Gaillot
              Jana Rehova Jana Rehova