OpenStack as Infra / OSASINFRA-2262

Investigate using Nova multi-create to fix racy scheduling



      Affinity policies might not be correctly applied by Nova when servers are created in parallel (Compute docs).

      There are two possible solutions:
      A) Wait for server N to be ACTIVE before triggering the creation of server N+1.
      B) Create multiple servers in a single batch using Nova's "multi-create" call.

      Solution A) has been applied to Terraform to ensure that the hardcoded "soft-anti-affinity" policy is enforced for the first three instances (bug) (patch).
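
      For illustration only, solution A) amounts to a serialized create-and-wait loop. The sketch below is not the Terraform patch; `create_server` and `get_status` are hypothetical callables standing in for whatever client is actually used, and the polling interval and timeout values are assumptions.

```python
import time
from typing import Callable, List


def create_serially(
    names: List[str],
    create_server: Callable[[str], str],  # hypothetical: creates a server, returns its ID
    get_status: Callable[[str], str],     # hypothetical: returns the Nova status string
    poll_interval: float = 2.0,           # assumed value
    timeout: float = 600.0,               # assumed value
) -> List[str]:
    """Create servers one at a time, waiting for ACTIVE before starting the next."""
    server_ids: List[str] = []
    for name in names:
        server_id = create_server(name)
        deadline = time.monotonic() + timeout
        while True:
            status = get_status(server_id)
            if status == "ACTIVE":
                break
            if status == "ERROR":
                raise RuntimeError(f"server {name} ({server_id}) went to ERROR")
            if time.monotonic() >= deadline:
                raise TimeoutError(f"server {name} ({server_id}) not ACTIVE in time")
            time.sleep(poll_interval)
        # Only now may the next server be requested.
        server_ids.append(server_id)
    return server_ids
```

      The cost of this approach is that total creation time grows linearly with the number of servers, which is part of why it only covers the first three masters.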

      However, this solution:

      • only applies to the first three master nodes (if replicas is greater than 3, the additional masters are created by CAPO in parallel)
      • does not apply to workers (customers can define affinity policies at install or day-2 ops)
      • is not compatible with the planned single-node deployment feature.

      This card is about applying solution B) to Terraform and to CAPO's MachineSet scaling.
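
      As a rough sketch of what solution B) looks like on the wire, the helper below builds the JSON body for a single Nova `POST /servers` request that asks for several servers at once through `min_count` and `max_count`, with the server group passed as a scheduler hint. The function name and its parameters are hypothetical; how Terraform or CAPO would actually issue the call is exactly what this card is meant to investigate.

```python
from typing import Any, Dict, List


def multi_create_body(
    name: str,
    image_ref: str,
    flavor_ref: str,
    network_ids: List[str],
    server_group_id: str,
    count: int,
) -> Dict[str, Any]:
    """Build a Nova create-server body that requests `count` servers in one call."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return {
        "server": {
            "name": name,
            "imageRef": image_ref,
            "flavorRef": flavor_ref,
            "networks": [{"uuid": net_id} for net_id in network_ids],
            # Same value for both: ask for exactly `count` servers.
            "min_count": count,
            "max_count": count,
        },
        # Scheduler hints are a sibling of "server", not nested inside it.
        "os:scheduler_hints": {"group": server_group_id},
    }
```

      Because the whole batch is submitted as one request, the expectation (to be confirmed by this investigation) is that the scheduler can place the servers with knowledge of each other rather than racing between independent requests.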

      In the context of Terraform, this solution will re-enable creating clusters with fewer than three master nodes.

      In the context of CAPO, this solution is expected to work around affinity issues during worker scale-out. Moreover, it will apply to master scale-out when/if OpenShift starts using MachineSets for Control plane nodes.

      To be investigated: the API documentation warns:
      "Error handling for multiple create is not as consistent as for single server create, and there is no guarantee that all the servers will be built. This call should generally be avoided in favor of clients doing direct individual server creates."
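
      One way to cope with that caveat, sketched here under assumptions, is for the caller to check the batch outcome itself after the multi-create call returns. The helper below takes server records (dicts with `id` and `status`, as found in Nova's server representation) and reports how many reached ACTIVE, which ones failed, and how many were never created. The function name and the shape of the returned summary are hypothetical, and how the batch's servers get listed is left to the caller.

```python
from typing import Any, Dict, Iterable


def summarize_batch(servers: Iterable[Dict[str, str]], expected: int) -> Dict[str, Any]:
    """Summarize the outcome of a multi-create batch of `expected` servers."""
    records = list(servers)
    active = sum(1 for s in records if s["status"] == "ACTIVE")
    failed_ids = [s["id"] for s in records if s["status"] == "ERROR"]
    return {
        "active": active,
        "failed_ids": failed_ids,
        # Servers that were requested but do not exist at all.
        "missing": max(expected - len(records), 0),
        "complete": active == expected,
    }
```

      A caller could then decide whether to delete and retry the failed servers or to fail the whole operation, which is the part of the error handling that single-server creates make simpler.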

            Assignee: Unassigned
            Reporter: Pierre Prinetti (pprinett@redhat.com)