OpenShift Cloud / OCPCLOUD-2564

Implement migration controller to handle authority transitions


    • Type: Story
    • Resolution: Done
    • Priority: Critical
    • Sprint: CLOUD Sprint 262, CLOUD Sprint 268, CLOUD Sprint 269, CLOUD Sprint 270, CLOUD Sprint 271

      Background

      https://github.com/openshift/enhancements/pull/1465 describes the process by which the authoritative API is handed over between Machine API and Cluster API.

      This process involves checking for a Synchronized condition on the Machine API resource, and relies on the synchronized generation and the Paused condition on both the Machine API and Cluster API resources.

      A control loop should be built to handle just the transitional logic. Keeping the control loop separate will make this logic easy to reason about and easy to test.

      The actual sync of resources will be handled in a separate, isolated loop.

      The transition control loop will:

      • For each resource that supports the authoritative API
        • Wait for spec.authoritativeAPI to differ from status.authoritativeAPI (the user initiated a migration)
        • Set status.authoritativeAPI to Migrating
        • Wait for the previous authoritative API resource to report the Paused condition as True
        • Check that the Synchronized condition and the synchronizedGeneration field are up to date
        • Set status.authoritativeAPI to match spec.authoritativeAPI
      • If the synchronized generation is not up to date, wait until the sync loop updates it
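
      The list above can be read as a small decision function. The following is a minimal sketch in Go, not the controller's actual code: the State struct, the Action values and nextAction are hypothetical names introduced for illustration, and the authority strings ("MachineAPI", "ClusterAPI", "Migrating") are assumed values. It only decides what to do next for one resource; it does not talk to a cluster.

```go
package main

import "fmt"

// Authority is an assumed representation of the authoritativeAPI values.
type Authority string

const (
	AuthorityMachineAPI Authority = "MachineAPI"
	AuthorityClusterAPI Authority = "ClusterAPI"
	AuthorityMigrating  Authority = "Migrating"
)

// State is a hypothetical snapshot of what the loop observes for one resource.
type State struct {
	SpecAuthority          Authority // spec.authoritativeAPI
	StatusAuthority        Authority // status.authoritativeAPI
	OldAuthorityPaused     bool      // Paused condition on the previous authoritative resource
	Synchronized           bool      // Synchronized condition on the Machine API resource
	Generation             int64     // generation the sync loop is expected to have caught up to
	SynchronizedGeneration int64     // status.synchronizedGeneration
}

// Action names the next step the loop would take (hypothetical labels).
type Action string

const (
	ActionNone         Action = "nothing to do"
	ActionSetMigrating Action = "set status.authoritativeAPI to Migrating"
	ActionWaitForPause Action = "wait for Paused condition on old authority"
	ActionWaitForSync  Action = "wait for the sync loop to catch up"
	ActionComplete     Action = "set status.authoritativeAPI to match spec"
)

// nextAction walks the ordered checks from the ticket and returns the first
// one that is not yet satisfied.
func nextAction(s State) Action {
	switch {
	case s.StatusAuthority == s.SpecAuthority:
		return ActionNone // no migration requested
	case s.StatusAuthority != AuthorityMigrating:
		return ActionSetMigrating // user changed spec; start the handover
	case !s.OldAuthorityPaused:
		return ActionWaitForPause
	case !s.Synchronized || s.SynchronizedGeneration != s.Generation:
		return ActionWaitForSync
	default:
		return ActionComplete
	}
}

func main() {
	// Walk one resource from MachineAPI to ClusterAPI, changing one
	// observation at a time and printing the action chosen at each point.
	s := State{
		SpecAuthority:          AuthorityClusterAPI,
		StatusAuthority:        AuthorityMachineAPI,
		Synchronized:           true,
		Generation:             2,
		SynchronizedGeneration: 1,
	}
	fmt.Println(nextAction(s))
	s.StatusAuthority = AuthorityMigrating
	fmt.Println(nextAction(s))
	s.OldAuthorityPaused = true
	fmt.Println(nextAction(s))
	s.SynchronizedGeneration = 2
	fmt.Println(nextAction(s))
}
```

      Keeping the decision pure (state in, action out) is one way to get the "easy to reason about and easy to test" property the ticket asks for; the real controller may be structured differently.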

      Behaviours

      • Watch Machines/MachineSets for .spec.authoritativeAPI not equal to .status.authoritativeAPI
      • Set .status.authoritativeAPI to match .spec.authoritativeAPI when .status.authoritativeAPI is empty
        • Exit here
      • Check that the Synchronized condition on the MAPI resource is True, exit if not
      • Move status.authoritativeAPI to Migrating
      • If moving away from CAPI, pause the CAPI resource
      • Wait for the Paused condition to be True on the old authoritative resource
      • Check that the Synchronized condition and .status.synchronizedGeneration are up to date on the MAPI resource, exit if not
        • Error if the condition is False
        • Requeue later if synchronizedGeneration is not up to date
      • Add the appropriate finalizer to the new authoritative API resource
        • Requeue here until we observe this in the cache
      • Remove the finalizer from the old authoritative API resource
        • Requeue here until we observe this in the cache
      • Move status.authoritativeAPI to match spec.authoritativeAPI and reset status.synchronizedGeneration
      • If moving to CAPI, unpause the CAPI resource
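
      The behaviours above, including the finalizer handover and the CAPI pause/unpause, can be sketched as a reconcile function that performs at most one step per call and asks to be requeued. This is an illustrative in-memory model only: Pair, reconcile and every field name are hypothetical, no real client or cache is involved, and the authority strings are assumed values.

```go
package main

import (
	"errors"
	"fmt"
)

type Authority string

const (
	MachineAPI Authority = "MachineAPI"
	ClusterAPI Authority = "ClusterAPI"
	Migrating  Authority = "Migrating"
)

// Pair is a hypothetical stand-in for a MAPI resource and its CAPI mirror.
type Pair struct {
	Spec, Status       Authority
	Synchronized       bool  // Synchronized condition on the MAPI resource
	Generation         int64 // generation the sync loop should have reached
	SyncedGeneration   int64 // status.synchronizedGeneration
	CAPIPauseRequested bool  // whether the CAPI resource is marked paused
	OldAuthorityPaused bool  // Paused condition observed on the old authority
	MAPIFinalizer      bool
	CAPIFinalizer      bool
}

var errNotSynchronized = errors.New("synchronized condition is false")

// reconcile applies at most one transition step and reports whether the
// caller should requeue. It follows the order of the Behaviours list.
func reconcile(p *Pair) (requeue bool, err error) {
	if p.Status == "" {
		p.Status = p.Spec // first observation: adopt spec and exit
		return false, nil
	}
	if p.Status == p.Spec {
		return false, nil
	}
	toCAPI := p.Spec == ClusterAPI
	if p.Status != Migrating {
		if !p.Synchronized {
			return false, nil // exit; retried when the sync loop updates the resource
		}
		p.Status = Migrating
		return true, nil
	}
	if !toCAPI && !p.CAPIPauseRequested {
		p.CAPIPauseRequested = true // moving away from CAPI: pause it
		return true, nil
	}
	if !p.OldAuthorityPaused {
		return true, nil // wait for the old authority to report Paused
	}
	if !p.Synchronized {
		return false, errNotSynchronized
	}
	if p.SyncedGeneration != p.Generation {
		return true, nil // sync loop has not caught up yet
	}
	newFin, oldFin := &p.CAPIFinalizer, &p.MAPIFinalizer
	if !toCAPI {
		newFin, oldFin = oldFin, newFin
	}
	if !*newFin {
		*newFin = true // add finalizer to the new authority first
		return true, nil
	}
	if *oldFin {
		*oldFin = false // then drop it from the old authority
		return true, nil
	}
	p.Status = p.Spec
	p.SyncedGeneration = 0 // reset status.synchronizedGeneration
	if toCAPI {
		p.CAPIPauseRequested = false // moving to CAPI: unpause it
	}
	return false, nil
}

func main() {
	p := &Pair{Spec: ClusterAPI, Status: MachineAPI, Synchronized: true,
		Generation: 3, SyncedGeneration: 3, CAPIPauseRequested: true, MAPIFinalizer: true}
	for i := 0; i < 10; i++ {
		requeue, err := reconcile(p)
		if p.Status == Migrating {
			p.OldAuthorityPaused = true // simulate the MAPI controller pausing
		}
		fmt.Printf("step %d: status=%s requeue=%v err=%v\n", i, p.Status, requeue, err)
		if !requeue {
			break
		}
	}
}
```

      Doing one mutation per pass and then requeueing mirrors the "requeue here until we observe this in cache" notes: each later step only runs once the earlier change is visible in the observed state.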

      Steps

      • Create a new control loop in the Cluster-CAPI-Operator repo
      • Implement the logic as described above
      • Add tests to test the transitional behaviour using envtest

      Stakeholders

      • Cluster Infra

      Definition of Done

      • Control loop for migration is implemented and tested
      • Docs
      • <Add docs requirements for this card>
      • Testing
      • <Explain testing that will be added>

              ddonati@redhat.com Damiano Donati
              joelspeed Joel Speed
              Zhaohua Sun Zhaohua Sun