RHEL-87524

On RHEL 9.4, Pacemaker controld tries to connect to subdaemon before it gets respawned


    • Bug
    • Resolution: Duplicate
    • Critical
    • rhel-9.4
    • pacemaker
    • rhel-ha
    • All

      What were you trying to do that didn't work?

      When a Pacemaker subdaemon hangs (in this case, pacemaker-attrd), Pacemaker tries five times to connect to the process, then kills it and respawns it. The problem we encountered is a small timing window between killing the process and respawning it. If pacemaker-controld tries to connect to a subdaemon that has been killed and is still respawning, the connection fails; controld treats that failure as a fatal error and shuts down the entire Pacemaker stack.
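
The timing window described above can be illustrated with a toy model. This is only a sketch and not Pacemaker code: `ToySubdaemon`, `connect_once`, and the timing values are hypothetical names and numbers chosen for illustration, and the model covers only the period from the kill onward.

```python
from dataclasses import dataclass


@dataclass
class ToySubdaemon:
    """Hypothetical stand-in for a subdaemon such as pacemaker-attrd."""

    killed_at: float      # assumed time at which the hung process was killed
    respawn_delay: float  # assumed gap until the new process accepts IPC

    def accepts_connections(self, now: float) -> bool:
        # From the kill until the respawn completes, nothing is listening.
        return now >= self.killed_at + self.respawn_delay


def connect_once(daemon: ToySubdaemon, now: float) -> str:
    """Model the reported behavior: a single failed connect is fatal."""
    if daemon.accepts_connections(now):
        return "connected"
    return "fatal"


daemon = ToySubdaemon(killed_at=10.0, respawn_delay=1.0)
for now in (10.2, 11.0):
    # 10.2 falls inside the window; 11.0 is after the respawn completes.
    print(now, connect_once(daemon, now))
```

In this model, a connection attempt at any time inside the window produces the fatal outcome, which is why the bug depends on timing.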

      What is the impact of this issue to you?

      • Pacemaker encounters a fatal error, shuts itself down, and does not recover without manual intervention.

        Please provide the package NVR for which the bug is seen:

      version 2.1.8-3.el9-3980678f0

      How reproducible is this bug?:

      Difficult to reproduce, because pacemaker-controld must try to interact with the killed subdaemon before it respawns.

      Steps to reproduce

      1. Send SIGSTOP (kill -SIGSTOP) to one of the Pacemaker subdaemons; in this example, pacemaker-attrd.
      2. Pacemaker logs that attrd is unresponsive to IPC, then kills and respawns attrd.
      3. After attrd is killed and before it respawns, controld tries to connect to attrd (for example, to update a failure count).
      4. pacemaker-controld fails to connect to attrd and shuts down the entire Pacemaker stack.
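
Step 1 above could be scripted roughly as follows. This is a minimal sketch for a disposable test cluster: the report only says to run `kill -SIGSTOP`, so looking up the PID with the `pidof` utility is an assumption, and `find_pids` and `pause_process` are hypothetical helper names.

```python
import os
import signal
import subprocess


def find_pids(process_name: str) -> list[int]:
    """Return the PIDs that the `pidof` utility reports for a process name."""
    result = subprocess.run(["pidof", process_name], capture_output=True, text=True)
    return [int(pid) for pid in result.stdout.split()]


def pause_process(pid: int) -> None:
    """Suspend a process with SIGSTOP, which cannot be caught or ignored."""
    os.kill(pid, signal.SIGSTOP)


# Example use on a test cluster node (requires sufficient privileges):
#     for pid in find_pids("pacemaker-attrd"):
#         pause_process(pid)
```

Whether the remaining steps trigger the failure still depends on timing, since controld has to attempt its connection inside the respawn window.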

      Expected results

      Pacemaker does not try to interact with a subdaemon that it has just killed and is in the process of respawning.
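
One way to picture the expected behavior is a readiness gate that a client checks before connecting. This is a sketch of the general idea only, not how Pacemaker implements or fixes this: `RespawnGate` and its methods are hypothetical, and the timer stands in for a real respawn.

```python
import threading


class RespawnGate:
    """Hypothetical gate a supervisor could use to signal subdaemon readiness."""

    def __init__(self) -> None:
        self._ready = threading.Event()
        self._ready.set()  # the subdaemon starts out available

    def mark_killed(self) -> None:
        # Called when the supervisor kills the hung subdaemon.
        self._ready.clear()

    def mark_respawned(self) -> None:
        # Called once the new subdaemon process accepts connections again.
        self._ready.set()

    def wait_until_ready(self, timeout: float) -> bool:
        # Block until the subdaemon is back; return False if the wait times out.
        return self._ready.wait(timeout)


gate = RespawnGate()
gate.mark_killed()
threading.Timer(0.1, gate.mark_respawned).start()  # simulated respawn
if gate.wait_until_ready(timeout=2.0):
    print("subdaemon is back; safe to connect")
else:
    print("respawn timed out; escalate")
```

With a gate like this, a client that arrives during the respawn window waits for a bounded time instead of treating the first failed connection as fatal.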

      Actual results

      Pacemaker interacts with the subdaemon it has just killed, and as a result the entire Pacemaker stack goes down.

              rhn-support-clumens Christopher Lumens
              donghohan@ibm.com Dongho Han
              Chris Feist, Christopher Lumens, Dongho Han
              Christopher Lumens
              Cluster QE