Red Hat Advanced Cluster Management / ACM-30119

CAPOA controlplane not handling orphaned metal3machines

    • Bug
    • Resolution: Done
    • Normal
    • None
    • ACM 2.16.0
    • CAPOA
    • None
    • Quality / Stability / Reliability
    • False
    • False
    • Important
    • None

      Description of problem:

      CAPOA controlplane does not handle metal3machine cleanup:

      • When a race condition triggers and more machines than necessary are created, the associated metal3machine is created but the machine creation fails. This leads to orphaned metal3machines.
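
To illustrate what "orphaned" means here, the following is a minimal sketch, not CAPOA's actual code. It assumes an orphan is a metal3machine with no owner reference to a Machine that still exists. `OwnerRef`, `InfraMachine`, and `find_orphans` are hypothetical simplified stand-ins, not the real Cluster API or CAPM3 types.

```python
from dataclasses import dataclass, field


@dataclass
class OwnerRef:
    # Hypothetical, reduced form of a Kubernetes owner reference.
    kind: str
    name: str


@dataclass
class InfraMachine:
    # Hypothetical stand-in for a metal3machine; not the real API type.
    name: str
    owner_refs: list = field(default_factory=list)


def find_orphans(infra_machines, machine_names):
    """Return infra machines with no owner reference to an existing Machine."""
    existing = set(machine_names)
    orphans = []
    for im in infra_machines:
        owned = any(
            ref.kind == "Machine" and ref.name in existing
            for ref in im.owner_refs
        )
        if not owned:
            orphans.append(im)
    return orphans


# Three Machines exist, but five infra machines were created during the race.
machines = ["cp-0", "cp-1", "cp-2"]
infra = [
    InfraMachine("cp-0", [OwnerRef("Machine", "cp-0")]),
    InfraMachine("cp-1", [OwnerRef("Machine", "cp-1")]),
    InfraMachine("cp-2", [OwnerRef("Machine", "cp-2")]),
    InfraMachine("cp-3", [OwnerRef("Machine", "cp-3")]),  # its Machine was never created
    InfraMachine("cp-4"),  # never received an owner reference
]
orphan_names = [im.name for im in find_orphans(infra, machines)]
print(orphan_names)
```

A cleanup pass could delete whatever `find_orphans` returns, but whether the fix takes that shape is not stated in this report.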

      Version-Release number of selected component (if applicable):

      How reproducible:

      This reproduces only when the right race condition is hit: it went undetected with upstream CAPM3 1.11 and 1.12, but it has been triggered with downstream CAPM3 4.21.
      It does not happen 100% of the time, but often enough (around 50% with my setup). When it does not happen, it seems to be because a lot of worker nodes are being spawned by core CAPI.

      Steps to Reproduce:

      1. Provision a cluster with MCE 2.11 (CAPI+CAPOA+CAPM3)
      2. Observe the "machine storm"
      3. ...

      Actual results:

      • More machines than requested are spawned

      Expected results:

      • Only the requested number of machines is spawned
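
The report does not describe the root cause, so the following is only a sketch of one common guard against over-creation, not the fix that was applied. It assumes the controller can see both the machines that already exist and the creations it has requested but not yet observed. `machines_to_create` and its `in_flight` parameter are hypothetical names.

```python
def machines_to_create(desired_replicas, existing_machines, in_flight):
    """Return how many new machines are still needed, never a negative number.

    in_flight is a hypothetical list of creations already requested but not
    yet visible in existing_machines, for example because of a stale cache.
    """
    accounted = len(existing_machines) + len(in_flight)
    return max(0, desired_replicas - accounted)


# Two machines are visible and one creation is pending: nothing more to create.
pending_case = machines_to_create(3, ["cp-0", "cp-1"], ["cp-2"])
# Ignoring the pending creation would request one machine too many.
stale_case = machines_to_create(3, ["cp-0", "cp-1"], [])
print(pending_case, stale_case)
```

Counting pending creations is what keeps a second reconcile from requesting the same replica again before the first request becomes visible.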

      Additional info:

              rh-ee-rpiccoli Riccardo Piccoli