OpenShift Bugs / OCPBUGS-59955

Race mirroring CAPI->MAPI machines when creating MAPI MachineSet

    • Quality / Stability / Reliability
    • False
    • Rejected

      The bug can be observed in this test flake:

      https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_cluster-capi-operator/339/pull-ci-openshift-cluster-capi-operator-main-unit/1950458809658904576

      The relevant extract from this flake is copied below.

      In the test, all the CAPI resources exist, but none of the MAPI resources exist. Finally we create a MAPI MachineSet corresponding to the CAPI MachineSet, which should cause the CAPI Machines to be mirrored to MAPI.

      However, this races in the test, and we should expect it to always fail in production. The machine sync controller does not watch MAPI MachineSets, so if the Machines are fully reconciled before the MAPI MachineSet is created, nothing triggers another reconcile once it exists. Because the test creates the objects in quick succession, the MachineSet apparently often exists before the first reconcile completes, which is why this often succeeds in the test suite.
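
      The ordering problem described above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: the types and functions below (cluster, reconcileMachine, machinesForMachineSet, createMAPIMachineSet) and the MachineSet name "ms" are hypothetical stand-ins, not the controller's real code or the real MAPI/CAPI API types. It models a reconciler that mirrors a CAPI Machine only when the MAPI MachineSet already exists, and compares what happens with and without a MachineSet-triggered requeue.

```go
package main

import "fmt"

// capiMachine is a simplified stand-in for a CAPI Machine (hypothetical type).
type capiMachine struct {
	Name       string
	MachineSet string // name of the owning CAPI MachineSet
}

// cluster is a toy in-memory model of the objects involved (hypothetical type).
type cluster struct {
	capiMachines    []capiMachine
	mapiMachineSets map[string]bool // MAPI MachineSets that exist, by name
	mapiMachines    map[string]bool // MAPI Machines mirrored so far, by name
}

func newCluster() *cluster {
	return &cluster{
		capiMachines:    []capiMachine{{Name: "foo", MachineSet: "ms"}},
		mapiMachineSets: map[string]bool{},
		mapiMachines:    map[string]bool{},
	}
}

// reconcileMachine mirrors one CAPI Machine to MAPI, but only if the MAPI
// counterpart of its owning MachineSet already exists. Otherwise it does
// nothing and, as in the bug, nothing schedules a retry.
func (c *cluster) reconcileMachine(name string) {
	for _, m := range c.capiMachines {
		if m.Name == name && c.mapiMachineSets[m.MachineSet] {
			c.mapiMachines[m.Name] = true
		}
	}
}

// machinesForMachineSet maps a MachineSet name to the Machines that should
// be requeued when that MachineSet changes (assumed mapping, for illustration).
func (c *cluster) machinesForMachineSet(ms string) []string {
	var names []string
	for _, m := range c.capiMachines {
		if m.MachineSet == ms {
			names = append(names, m.Name)
		}
	}
	return names
}

// createMAPIMachineSet creates the MAPI MachineSet. When watch is true it
// also requeues the owned Machines, modelling a controller that reacts to
// MachineSet events.
func (c *cluster) createMAPIMachineSet(name string, watch bool) {
	c.mapiMachineSets[name] = true
	if !watch {
		return
	}
	for _, n := range c.machinesForMachineSet(name) {
		c.reconcileMachine(n)
	}
}

func main() {
	for _, watch := range []bool{false, true} {
		c := newCluster()
		c.reconcileMachine("foo") // Machine reconciled before the MachineSet exists
		c.createMAPIMachineSet("ms", watch)
		fmt.Printf("watch MachineSet=%v mirrored=%v\n", watch, c.mapiMachines["foo"])
	}
}
```

      Without the requeue, the Machine that was reconciled early is never mirrored; with it, the MachineSet creation drives the missing reconcile. How the real controller would wire such a trigger is not stated in this report.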

      With a running MachineSync Reconciler when all the CAPI infra resources exist when the MAPI machine does not exist and the CAPI machine does And there is a CAPI Machineset owning the machine with a MAPI counterpart should create a MAPI machine
      /go/src/github.com/openshift/cluster-capi-operator/pkg/controllers/machinesync/machine_sync_controller_test.go:531
        STEP: Setting up a namespaces for the test @ 07/30/25 07:55:48.525
        STEP: Setting up a manager and controller @ 07/30/25 07:55:48.536
        STEP: Starting the manager @ 07/30/25 07:55:48.536
        STEP: Creating the CAPI infra machine @ 07/30/25 07:55:48.537
      I0730 07:55:48.537086   27952 server.go:208] "Starting metrics server" logger="controller-runtime.metrics"
      I0730 07:55:48.537276   27952 server.go:247] "Serving metrics server" logger="controller-runtime.metrics" bindAddress=":8080" secure=false
      I0730 07:55:48.537303   27952 controller.go:175] "Starting EventSource" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine" source="kind source: *v1beta1.Machine"
      I0730 07:55:48.537362   27952 controller.go:175] "Starting EventSource" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine" source="kind source: *v1beta1.Machine"
      I0730 07:55:48.537418   27952 controller.go:175] "Starting EventSource" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine" source="kind source: *v1beta2.AWSMachine"
      I0730 07:55:48.537446   27952 controller.go:183] "Starting Controller" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine"
        STEP: Creating the CAPI machineset @ 07/30/25 07:55:48.539
        STEP: Creating the CAPI machine @ 07/30/25 07:55:48.542
      I0730 07:55:48.643663   27952 controller.go:217] "Starting workers" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine" worker count=1
      I0730 07:55:48.643823   27952 machine_sync_controller.go:193] "MAPI Machine not found" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine" Machine="openshift-machine-api-djqvz/machine-template" namespace="openshift-machine-api-djqvz" name="machine-template" reconcileID="50c58adb-3edf-4681-b354-671f7123c6be" namespace="openshift-machine-api-djqvz" name="machine-template"
      I0730 07:55:48.643920   27952 machine_sync_controller.go:210] "Cluster API Machine not found" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine" Machine="openshift-machine-api-djqvz/machine-template" namespace="openshift-machine-api-djqvz" name="machine-template" reconcileID="50c58adb-3edf-4681-b354-671f7123c6be" namespace="openshift-machine-api-djqvz" name="machine-template"
      I0730 07:55:48.643971   27952 machine_sync_controller.go:220] "Cluster API and Machine API machines not found, nothing to do" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine" Machine="openshift-machine-api-djqvz/machine-template" namespace="openshift-machine-api-djqvz" name="machine-template" reconcileID="50c58adb-3edf-4681-b354-671f7123c6be" namespace="openshift-machine-api-djqvz" name="machine-template"
      I0730 07:55:48.644080   27952 machine_sync_controller.go:193] "MAPI Machine not found" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine" Machine="openshift-machine-api-djqvz/foo" namespace="openshift-machine-api-djqvz" name="foo" reconcileID="619ab15b-631b-48ef-a9f9-54f74f54f319" namespace="openshift-machine-api-djqvz" name="foo"
      I0730 07:55:48.853278   27952 machine_sync_controller.go:193] "MAPI Machine not found" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine" Machine="openshift-machine-api-djqvz/foo" namespace="openshift-machine-api-djqvz" name="foo" reconcileID="fcfc23ec-2
      I0730 07:55:48.856243   27952 machine_sync_controller.go:193] "MAPI Machine not found" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine" Machine="openshift-machine-api-djqvz/machine-template" namespace="openshift-machine-api-djqvz" name="machine-template" reconcileID="daba6bd0-1671-49fc-a788-e70021a71a84" namespace="openshift-machine-api-djqvz" name="machine-template"
      I0730 07:55:48.856323   27952 machine_sync_controller.go:210] "Cluster API Machine not found" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine" Machine="openshift-machine-api-djqvz/machine-template" namespace="openshift-machine-api-djqvz" name="machine-template" reconcileID="daba6bd0-1671-49fc-a788-e70021a71a84" namespace="openshift-machine-api-djqvz" name="machine-template"
      I0730 07:55:48.856369   27952 machine_sync_controller.go:220] "Cluster API and Machine API machines not found, nothing to do" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine" Machine="openshift-machine-api-djqvz/machine-template" namespace="openshift-machine-api-djqvz" name="machine-template" reconcileID="daba6bd0-1671-49fc-a788-e70021a71a84" namespace="openshift-machine-api-djqvz" name="machine-template"
        [FAILED] in [It] - /go/src/github.com/openshift/cluster-capi-operator/pkg/controllers/machinesync/machine_sync_controller_test.go:532 @ 07/30/25 07:55:58.637
        STEP: Stopping the manager @ 07/30/25 07:55:58.637
      I0730 07:55:58.637637   27952 internal.go:538] "Stopping and waiting for non leader election runnables"
      I0730 07:55:58.637685   27952 internal.go:542] "Stopping and waiting for leader election runnables"
      I0730 07:55:58.637722   27952 controller.go:237] "Shutdown signal received, waiting for all workers to finish" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine"
      I0730 07:55:58.637809   27952 controller.go:239] "All workers finished" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine"
      I0730 07:55:58.637897   27952 internal.go:550] "Stopping and waiting for caches"
      I0730 07:55:58.638199   27952 internal.go:554] "Stopping and waiting for webhooks"
      I0730 07:55:58.638259   27952 internal.go:557] "Stopping and waiting for HTTP servers"
      I0730 07:55:58.638326   27952 server.go:254] "Shutting down metrics server with timeout of 1 minute" logger="controller-runtime.metrics"
      I0730 07:55:58.638476   27952 internal.go:561] "Wait completed, proceeding to shutdown the manager"
        STEP: Cleaning up MAPI test resources @ 07/30/25 07:55:58.638
      • [FAILED] [10.154 seconds]
      With a running MachineSync Reconciler when all the CAPI infra resources exist when the MAPI machine does not exist and the CAPI machine does And there is a CAPI Machineset owning the machine with a MAPI counterpart [It] should create a MAPI machine
      /go/src/github.com/openshift/cluster-capi-operator/pkg/controllers/machinesync/machine_sync_controller_test.go:531
      
        [FAILED] Timed out after 10.001s.
        Value for field 'Items' failed to satisfy matcher.
        Expected
            <[]v1beta1.Machine | len:0, cap:0>: []
        to contain element matching
            <*matchers.HaveFieldMatcher | 0xc001402e00>: {
                Field: "ObjectMeta.Name",
                Expected: <*matchers.EqualMatcher | 0xc000b16600>{
                    Expected: <string>"foo",
                },
            }
        In [It] at: /go/src/github.com/openshift/cluster-capi-operator/pkg/controllers/machinesync/machine_sync_controller_test.go:532 @ 07/30/25 07:55:58.637
      
        Full Stack Trace
          github.com/openshift/cluster-capi-operator/pkg/controllers/machinesync.init.func1.5.7.2.3.2()
              /go/src/github.com/openshift/cluster-capi-operator/pkg/controllers/machinesync/machine_sync_controller_test.go:532 +0x474
      

              ddonati@redhat.com Damiano Donati
              rhn-gps-mbooth Matthew Booth
              Zhaohua Sun Zhaohua Sun