Red Hat OpenStack Services on OpenShift / OSPRH-12515

BZ#2319767 duplicate stonith-fence_compute-fence-nova location constraints


    • PIDONE 18.0.4, PIDONE 18.0.5, PIDONE 18.0.6, PIDONE 18.0.7, PIDONE 18.0.8
    • 5
    • Low

      Description of problem:
      On large deployments with instanceha, deployment fails with many different errors such as this:
      ~~~
      Oct 17 23:19:42 puppet-user: Error: /Stage[main]/Tripleo::Profile::Pacemaker::Compute_instanceha/Pacemaker::Property[compute-instanceha-role-node-property]/Pcmk_property[property-overcloud-compute-0-compute-instanceha-role]: Could not evaluate: pcs -f node attribute overcloud-compute-0 | grep -e ' overcloud-compute-0:.*compute-instanceha-role=true
      ~~~

      and:
      ~~~
      <13>Oct 9 22:41:33 puppet-user: Error: /Stage[main]/Tripleo::Fencing/Pacemaker::Stonith::Fence_ipmilan[00:00:00:00:00:00]/Pcmk_stonith[stonith-fence_ipmilan-0000000000]: Could not evaluate: pcs -f constraint location | grep stonith-fence_ipmilan-000000000 > /dev/null 2>&1 failed: . Too many tries
      ~~~
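In both failing commands, `pcs -f` is followed directly by the subcommand, which suggests that the CIB file path was empty when the command line was built. The sketch below only illustrates that command shape. It is not the puppet-pacemaker provider code (which is Ruby); `build_pcs_command` and `build_pcs_command_checked` are hypothetical helpers introduced here for illustration.

```python
import shlex


def build_pcs_command(cib_path, subcommand):
    """Hypothetical helper: build a `pcs -f <file> <subcommand>` command string.

    An unset path is interpolated as an empty string, similar to what
    "#{cib}" yields in Ruby when the variable is nil.
    """
    path = cib_path or ""
    # Collapse repeated whitespace so the result matches how the command
    # appears in the log lines.
    return " ".join(f"pcs -f {path} {subcommand}".split())


def build_pcs_command_checked(cib_path, subcommand):
    """Hypothetical guard: fail early instead of emitting `pcs -f` with no file."""
    if not cib_path:
        raise ValueError("CIB file path is empty; refusing to build pcs command")
    return f"pcs -f {shlex.quote(cib_path)} {subcommand}"
```

With an unset path, the first helper produces `pcs -f constraint location`, the same shape as in the second error. The checked variant raises at the point where the path is missing, which would surface the problem earlier than the later "Too many tries" failure.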

      Version-Release number of selected component (if applicable):
      17.1.3

      How reproducible:
      Always

      Steps to Reproduce:
      1. Deploy with 198 hosts (3 controllers and 195 instanceha)

      Actual results:
      Random failures of deployment

      Expected results:
      No issues

      Additional info:
      This looks like a concurrency bug in pacemaker: when we run more than 4 concurrent cibadmin --query (and maybe --push) commands, we get a timeout error, but we do not appear to fail at that point. We fail later, when we try to use the generated cib.xml file with "-f", and the "-f" argument appears to be empty because #{cib} is undefined for some reason. This is speculation at this point, apart from the concurrent cibadmin timeouts, which we can reproduce manually.
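A minimal sketch of how the manual reproduction could be scripted, assuming it is run on a cluster node where `cibadmin` is available. The helper names `run_concurrently` and `summarize` and the 120-second timeout are assumptions made for this example, not values taken from the deployment.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(argv, workers, timeout=120):
    """Run the command `argv` `workers` times in parallel.

    Returns a list of (returncode, stdout_length) tuples. A run that
    exceeds `timeout` seconds is reported as (None, 0).
    """
    def run_one(_index):
        try:
            proc = subprocess.run(argv, capture_output=True, timeout=timeout)
            return proc.returncode, len(proc.stdout)
        except subprocess.TimeoutExpired:
            return None, 0

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_one, range(workers)))


def summarize(results):
    """Count runs that exited non-zero, timed out, or produced no output."""
    failed = [r for r in results if r[0] != 0 or r[1] == 0]
    return {"total": len(results), "failed": len(failed)}
```

On a cluster node, `summarize(run_concurrently(["cibadmin", "--query"], 8))` would show how many of 8 parallel queries failed or returned an empty CIB. Treating empty output as a failure matters here because the deployment only seems to notice the problem later, when the dumped file is used.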

              dabarzil Daniel Barzilay
              jira-bugzilla-migration RH Bugzilla Integration
              Joe Hakim Rahme Joe Hakim Rahme
              rhos-dfg-pidone