  1. Red Hat OpenStack Services on OpenShift
  2. OSPRH-1890

As a cloud operator, I would like to roll back to the old 17.1-based control plane if the adoption process fails


    • False
    • Committed
    • Committed
    • To Do
    • RHOSSTRAT-204 - Red Hat OpenStack 18.0 Data Plane Adoption
    • Committed
    • Committed
    • 0% To Do, 0% In Progress, 100% Done
    • Release Note Not Required
    • Proposed
    • Approved

      Epic Overview
      The primary focus is on implementing a reliable roll-back mechanism in case of adoption failures.

      Goals
      The epic aims to benefit operators and system administrators by providing a seamless and secure upgrade process. The roll-back mechanism ensures minimal downtime and operational disruptions in case of adoption failures.

      Requirements
      A list of specific needs or objectives that the Feature must deliver. Some requirements are flagged as MVP. If an MVP requirement slips, the feature shifts. If a non-MVP requirement slips, the feature does not shift.

      Requirement | Notes | isMvp?
      Implement a rollback mechanism triggered by the human operator | | Yes
      Describe how to start the services on the 17.1 control node that were stopped during the adoption process | | Yes
      Revert compute nodes to a supported state (everything is running as it was) | | Yes

      Out of Scope
      Any mechanism that triggers automatic rollback based on failure conditions

      Assumptions

      • RHOSP 17.1 control plane runs in parallel during the RHOSO 18 Adoption procedure.
      • If adoption fails during control plane adoption, the control plane should be rolled back.
      • We make no changes to the source control plane DBs during the adoption procedure (that is, to the original MariaDB / OVN DBs running on the original controllers).
      • We should be able to restore the original cloud simply by starting the original OpenStack control plane services. We can also delete the podified control plane content from OpenShift without affecting the original control plane or data plane.

      Documentation Considerations
      The deliverable is a set of documentation steps describing how to revert to a stable RHOSP 17.1 control plane.

      Interoperability Considerations

      Questions

      Question: What happens if some compute nodes are adopted during the adoption? Can we support a mix of OSP 17.1 compute nodes and already-adopted RHOSO 18 nodes managed by the RHOSO control plane?
      Outcome: Once the data plane hosts are touched, they are past the point of no return and the only option is to keep going forward.

      Proposed high-level procedure, after OSPRH-2301 and OSPRH-1490 are implemented

      • Perform control plane adoption according to the docs. This includes copying MariaDB and OVN DB contents from the source control plane to the podified one. No edits are made directly on the source databases. Any DB edits are performed only on the podified side, without affecting the source side.
      • Assess the configuration using OS Diff and tweak the control plane configuration as necessary.
      • Point of no return: if at any point during the prior steps it is concluded that a rollback should be performed, this is the last opportunity to perform it. (Rollback == start the original control plane services and delete the podified resources.)
      • Adopt the data plane via EDPM. After the data plane is touched by EDPM, we consider rollback to be no longer possible.
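
The gating rule in the procedure above can be sketched in a few lines of Python. This is only an illustration of the ordering, not part of any adoption tooling: the `Stage` names, `rollback_possible`, and `rollback_actions` are hypothetical, and the two actions are listed in the order the ticket words them, which is not necessarily a prescribed sequence.

```python
from enum import IntEnum
from typing import List


class Stage(IntEnum):
    """Stages of the proposed procedure, in order (names are illustrative)."""
    CONTROL_PLANE_ADOPTION = 1  # includes copying MariaDB and OVN DB contents
    CONFIG_ASSESSMENT = 2       # OS Diff review and configuration tweaks
    POINT_OF_NO_RETURN = 3      # last opportunity to decide on a rollback
    DATA_PLANE_ADOPTION = 4     # EDPM touches the data plane hosts


def rollback_possible(stage: Stage) -> bool:
    """Rollback stays possible until EDPM has touched the data plane."""
    return stage < Stage.DATA_PLANE_ADOPTION


def rollback_actions(stage: Stage) -> List[str]:
    """Return the operator-triggered rollback actions, or refuse if too late."""
    if not rollback_possible(stage):
        raise RuntimeError("data plane already touched by EDPM; keep going forward")
    return [
        "start the original control plane services",
        "delete the podified resources",
    ]
```

Calling `rollback_actions(Stage.CONFIG_ASSESSMENT)` returns the two actions, while `rollback_actions(Stage.DATA_PLANE_ADOPTION)` raises, mirroring the statement that rollback is no longer possible once the data plane is touched. Nothing here triggers a rollback automatically; the decision remains with the human operator.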

              jstransk@redhat.com Jiri Stransky
              pnavarro@redhat.com Pedro Navarro Perez
              Archana Singh Archana Singh
              rhos-dfg-upgrades