RHEL-62722

Pacemaker is unable to run with CIB containing an ACL role referencing a location constraint with several rules

    • Bug
    • Resolution: Unresolved
    • Major
    • rhel-10.0
    • rhel-10.0
    • pacemaker
    • None
    • Yes
    • Important
    • rhel-sst-high-availability
    • ssg_filesystems_storage_and_HA
    • 17
    • 20
    • 5
    • Dev ack
    • False
    • None
    • None
    • None
    • None
    • Release Note Not Required
    • This issue was not in a released build
    • Proposed
    • None

      What were you trying to do that didn't work?

      I run a pacemaker-2 cluster with a location constraint that contains two rules. The constraint is referenced in an ACL role. After upgrading to pacemaker-3, pacemaker no longer works: 'pcs cluster start' reports success, but 'pcs status' prints an error:

        Cannot upgrade configuration (claiming pacemaker-3.10 schema) to at least pacemaker-4.0 because it does not validate with any schema from pacemaker-3.10 to the latest
        Upgrade failed: Schema transform failed
        Error outputting status info from the fencer or CIB

      What is the impact of this issue to you?

      I haven't actually tested a cluster upgrade from RHEL 9 to RHEL 10, but it looks like customers could end up with a non-functioning cluster after the upgrade. If this is really the case, it should be prevented.

      Please provide the package NVR for which the bug is seen:

      pacemaker-2.1.8-39.cfd45a819f.git.el10.x86_64

      How reproducible is this bug?:

      always, easily

      Steps to reproduce

      Configure a CIB with an ACL role referencing a location constraint with two rules:

      <constraints>
        <rsc_location id="location-d3" rsc="d3">
          <rule id="location-d3-rule" boolean-op="and" score="INFINITY">
            <date_expression id="location-d3-rule-expr" operation="gt" start="2021-01-01"/>
          </rule>
          <rule id="location-d3-rule-1" boolean-op="and" score="INFINITY">
            <date_expression id="location-d3-rule-1-expr" operation="gt" start="2022-01-01"/>
          </rule>
        </rsc_location>
      </constraints>
      <acls>
        <acl_role id="test">
          <acl_permission id="test-deny" kind="deny" reference="location-d3"/>
        </acl_role>
      </acls>

      Upgrade from pacemaker 2 to pacemaker 3.

      Expected results

      I see two options (there may be more):

      • Pacemaker transforms the CIB in a way that keeps it valid, so the cluster can start with it
      • Upgrade from RHEL 9 to RHEL 10 is prevented with an explanatory error message
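      To illustrate the second option, here is a minimal sketch of a pre-upgrade check that flags ACL permissions whose 'reference' points at a location constraint with more than one rule. The function name find_risky_acl_references and the wrapping <configuration> element in the sample are assumptions made for illustration; this is not part of pacemaker or pcs, and it only looks at 'reference' attributes, not at xpath-based ACL permissions.

```python
import xml.etree.ElementTree as ET

# CIB fragment from this report, wrapped in a single root element so that it
# parses as one XML document (the wrapper is an assumption for this sketch).
SAMPLE_CIB = """
<configuration>
  <constraints>
    <rsc_location id="location-d3" rsc="d3">
      <rule id="location-d3-rule" boolean-op="and" score="INFINITY">
        <date_expression id="location-d3-rule-expr" operation="gt" start="2021-01-01"/>
      </rule>
      <rule id="location-d3-rule-1" boolean-op="and" score="INFINITY">
        <date_expression id="location-d3-rule-1-expr" operation="gt" start="2022-01-01"/>
      </rule>
    </rsc_location>
  </constraints>
  <acls>
    <acl_role id="test">
      <acl_permission id="test-deny" kind="deny" reference="location-d3"/>
    </acl_role>
  </acls>
</configuration>
"""


def find_risky_acl_references(cib_xml: str) -> list[tuple[str, str]]:
    """Return (acl_permission id, constraint id) pairs where the referenced
    rsc_location has more than one direct <rule> child."""
    root = ET.fromstring(cib_xml)
    # IDs of location constraints that carry several top-level rules
    multi_rule_ids = {
        loc.get("id")
        for loc in root.iter("rsc_location")
        if len(loc.findall("rule")) > 1
    }
    # ACL permissions whose 'reference' attribute points at one of those IDs
    return [
        (perm.get("id"), perm.get("reference"))
        for perm in root.iter("acl_permission")
        if perm.get("reference") in multi_rule_ids
    ]


if __name__ == "__main__":
    for perm_id, constraint_id in find_risky_acl_references(SAMPLE_CIB):
        print(f"ACL permission '{perm_id}' references multi-rule "
              f"location constraint '{constraint_id}'")
```

      A check like this could be run against a saved copy of the CIB before the upgrade. An empty result only means that this particular pattern was not found, not that the CIB is guaranteed to upgrade cleanly.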

      Actual results

      This is caused by pacemaker dropping support for multiple rules in a location constraint. Pacemaker has a transformation that modifies the CIB so that it matches the new CIB schema. However, that transformation changes the IDs of the affected location constraints. If those constraints' IDs are referenced in ACLs, the references no longer resolve and the resulting CIB is not valid. In that case, pacemaker logs contain the following messages:

      pacemaker-schedulerd[24945] (xml_log)   error: IDREF attribute reference references an unknown ID "location-d3"
      pacemaker-schedulerd[24945] (apply_upgrade)     error: Schema upgrade from pacemaker-3.10 to pacemaker-4.0 failed: XSL transform pipeline produced an invalid configuration
      pacemaker-schedulerd[24945] (xml_log)   error: Element rsc_location has extra content: rule
      pacemaker-schedulerd[24945] (xml_log)   error: Element constraints has extra content: rsc_location
      pacemaker-schedulerd[24945] (pcmk__update_configured_schema)    error: Cannot upgrade configuration (claiming pacemaker-3.10 schema) to at least pacemaker-4.0 because it does not validate with any schema from pacemaker-3.10 to the latest
      pacemaker-schedulerd[24945] (pcmk__log_transition_summary)      error: Calculated transition 0 (with errors), saving inputs in /var/lib/pacemaker/pengine/pe-error-6.bz2
      pacemaker-schedulerd[24945] (pcmk__log_transition_summary)      notice: Configuration errors found during scheduler processing,  please run "crm_verify -L" to identify issues
      

      'crm_verify -LVV' explains what's wrong:

      (xml_log)       error: IDREF attribute reference references an unknown ID "location-d3"
      (apply_upgrade)         error: Schema upgrade from pacemaker-3.10 to pacemaker-4.0 failed: XSL transform pipeline produced an invalid configuration
      (xml_log)       error: Element rsc_location has extra content: rule
      (xml_log)       error: Element constraints has extra content: rsc_location
      Cannot upgrade configuration (claiming pacemaker-3.10 schema) to at least pacemaker-4.0 because it does not validate with any schema from pacemaker-3.10 to the latest
      The cluster will NOT be able to use this configuration.
      Please manually update the configuration to conform to the pacemaker-4.0 syntax.
      error: CIB did not pass schema validation
      Configuration invalid (with errors)
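      When triaging output like the above, the dangling references can be pulled out mechanically. The following is a rough sketch that assumes the message wording shown in this report ('IDREF attribute ... references an unknown ID "..."'); the helper name unknown_idrefs is hypothetical, and the wording may differ in other pacemaker or libxml2 versions.

```python
import re

# Pattern based on the message text quoted in this report; treat the exact
# wording as an assumption that may not hold for other versions.
UNKNOWN_IDREF = re.compile(
    r'IDREF attribute (\S+) references an unknown ID "([^"]+)"'
)

# Excerpt of the crm_verify output from this report
SAMPLE_LOG = '''\
(xml_log)       error: IDREF attribute reference references an unknown ID "location-d3"
(apply_upgrade)         error: Schema upgrade from pacemaker-3.10 to pacemaker-4.0 failed: XSL transform pipeline produced an invalid configuration
(xml_log)       error: Element rsc_location has extra content: rule
(xml_log)       error: Element constraints has extra content: rsc_location
'''


def unknown_idrefs(log_text: str) -> list[tuple[str, str]]:
    """Return unique (attribute name, missing ID) pairs in order of appearance."""
    seen: list[tuple[str, str]] = []
    for match in UNKNOWN_IDREF.finditer(log_text):
        pair = (match.group(1), match.group(2))
        if pair not in seen:
            seen.append(pair)
    return seen


if __name__ == "__main__":
    for attribute, missing_id in unknown_idrefs(SAMPLE_LOG):
        print(f"attribute '{attribute}' points at missing ID '{missing_id}'")
```

      For the excerpt above, this reports the 'reference' attribute pointing at the missing ID 'location-d3', which is the ID of the location constraint used in the ACL permission.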

      Even though this is nicely debuggable, as pacemaker logs point to the root cause of the issue, it would be better if users didn't get into this situation in the first place.

       

              rhn-support-nwahl Reid Wahl
              tojeline@redhat.com Tomas Jelinek
              Kenneth Gaillot
              Marketa Smazova