RHEL-21051

[RFE] pcs should not guess expected status of a resource when --wait is used [rhel-9]

    • pcs-0.11.7-3.el9
    • Medium
    • FutureFeature
    • rhel-sst-high-availability
    • ssg_filesystems_storage_and_HA
    • 13
    • 19
    • 13
    • False
    • None
    • Yes
    • None
    • Enhancement
    • .`pcs` support for new commands to query the status of a resource in a cluster

      The `pcs` command-line interface now provides `pcs status query resource` commands to query various attributes of a single resource in a cluster. These commands query:

      * the existence of the resource
      * the type of the resource
      * the state of the resource
      * various information about the members of a collective resource
      * on which nodes the resource is running

      You can use these commands in pcs-based scripts because they remove the need to parse plain-text output.
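As an illustration of script-friendly querying, here is a minimal sketch of a wrapper that turns a query into a three-way answer. Only the `pcs status query resource` prefix comes from the note above. The `exists` subcommand, the `--quiet` flag, and the exit-code convention (0 for true, 2 for false, anything else for an error) are assumptions based on the pcs(8) man page and should be verified against the installed pcs version. The helper name `resource_exists` is hypothetical.

```shell
#!/bin/sh
# Sketch only, not part of pcs. Assumptions (verify with `man pcs`):
#   - `exists` is a valid `pcs status query resource` subcommand
#   - `--quiet` suppresses the True/False text output
#   - exit code 0 means true, 2 means false, anything else is an error
# resource_exists is a hypothetical helper name.
resource_exists() {
    rc=0
    # Capture the exit code without aborting under `set -e`.
    pcs status query resource "$1" exists --quiet || rc=$?
    case $rc in
        0) echo "yes" ;;
        2) echo "no" ;;
        *) echo "error"; return 1 ;;
    esac
}
```

A script could then branch on the printed value, for example `[ "$(resource_exists test1)" = "yes" ]`, instead of parsing the output of `pcs status`.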
    • Done
    • None

      Description of problem:

      When the --wait flag is used in a pcs command, pcs guesses which state the resource managed by the command should be in when the command finishes. At the end of the command, pcs checks the actual state of the resource and returns 0 if the actual and expected states match, or 1 if they do not.
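The exit-code rule described above can be modelled in a few lines of shell. This is an illustrative sketch only, not pcs source code: the function name `wait_exit_code` and the state strings passed to it are hypothetical.

```shell
#!/bin/sh
# Illustrative model of the --wait check described above; not pcs code.
# wait_exit_code is a hypothetical helper: it compares the state pcs guessed
# with the state actually observed and returns 0 on a match, 1 otherwise.
wait_exit_code() {
    expected_state=$1
    actual_state=$2
    if [ "$expected_state" = "$actual_state" ]; then
        return 0
    fi
    return 1
}
```

In this model, a guess of `started` for a resource that stays stopped, as in `wait_exit_code started stopped`, yields 1 even if staying stopped was the correct outcome.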

      The problem is that the expected state of the resource is very hard to determine correctly, so pcs may exit with a nonzero return code even though the command itself succeeded.

      Version-Release number of selected component (if applicable):

      pcs-0.9.158-4.el7.x86_64

      How reproducible:

      always, easily (depending on cluster settings complexity)

      Steps to Reproduce:

      pcs resource create test1 ocf:pacemaker:Dummy meta is-managed=false --wait
      Error: resource 'test1' is not running on any node
      echo $?
      1
      

      Actual results:

      pcs exits with 1 because the resource did not start

      Expected results:

      pcs exits with 0: the resource could not start because pacemaker does not start unmanaged resources, so the command itself succeeded

      Additional info:

      With this particular reproducer, the issue may seem easy to fix in pcs: if the resource is not managed, pcs should not expect it to start. However, more complex setups are possible: the resource may fail to start because of constraints, utilization, cluster properties, and so on. The problem is also not limited to resource create; most commands that support --wait are affected.
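One way to avoid depending on a guess is for the calling script to name the state it expects and check it explicitly. The sketch below is a hedged illustration of that idea, not the implemented fix. It assumes the `is-state` subcommand, the `unmanaged` and `started` state names, the `--quiet` flag, and an exit code of 0 for "true", all taken from the pcs(8) man page rather than from this report. The helper name `check_state` is hypothetical.

```shell
#!/bin/sh
# Sketch only: the caller states the expected outcome instead of pcs guessing.
# Assumptions (verify with `man pcs`): `is-state`, its state names, and
# `--quiet` exist as used here, and exit code 0 means the resource is in
# the given state. check_state is a hypothetical helper name.
check_state() {
    resource_id=$1
    expected_state=$2
    pcs status query resource "$resource_id" is-state "$expected_state" --quiet
}

# Possible use with the reproducer (left commented out, needs a cluster):
# pcs resource create test1 ocf:pacemaker:Dummy meta is-managed=false
# check_state test1 unmanaged && echo "test1 is in the expected state"
```

The design choice is that only the caller knows whether "not started" is a failure, so the expected state is passed in as an argument rather than derived from the command that was run.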

              rhn-support-nhostako Nina Hostakova
              tojeline@redhat.com Tomas Jelinek
              Peter Romancik Peter Romancik
              Nina Hostakova Nina Hostakova
              Steven Levine Steven Levine