  2. OSPRH-1890

As a cloud operator, I would like to roll back to the old 17.1-based control plane if the adoption process fails

    • False
    • Committed
    • Committed
    • Done
    • RHOSSTRAT-204 - Red Hat OpenStack 18.0 Data Plane Adoption
    • Committed
    • Committed
    • 0% To Do, 0% In Progress, 100% Done
    • Release Note Not Required
    • Proposed
    • Approved

      Epic Overview
      The primary focus is on implementing a reliable roll-back mechanism in case of adoption failures.

      Goals
      The epic aims to benefit operators and system administrators by providing a seamless and secure upgrade process. The roll-back mechanism ensures minimal downtime and operational disruptions in case of adoption failures.

      Requirements
      A list of specific needs or objectives that the Feature must deliver. Some requirements are flagged as MVP. If an MVP requirement slips, the feature slips. If a non-MVP requirement slips, the feature does not.

      Requirements | Notes | isMvp?
      Implement a rollback mechanism triggered by the human operator | | Yes
      Describe how to start the services on the 17.1 control plane nodes that were stopped during the adoption process | | Yes
      Revert compute nodes to a supported state (everything is running as it was) | | Yes

      Out of Scope
      Any mechanism that triggers automatic rollback based on failure conditions

      Assumptions

      • RHOSP 17.1 control plane runs in parallel during the RHOSO 18 Adoption procedure.
      • If adoption fails during control plane adoption, the control plane should be rolled back.
      • We do not make any changes to the source control plane DBs during the Adoption procedure (that is, to the original MariaDB / OVN DBs running on the original controllers).
      • We should be able to just start the original OpenStack control plane services to restore the original cloud. And we can delete the podified control plane content from OpenShift without affecting the original control plane or data plane.
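
      The last two assumptions suggest that a rollback reduces to two kinds of action: start the original services again, then delete the podified content. The sketch below only models that ordering as a list of human-readable steps. It is illustrative, not the documented procedure: `RollbackPlan`, `build_rollback_plan`, and the service and resource names in the usage note are hypothetical, and the start-then-delete order is an assumption taken from the wording of the assumption above.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class RollbackPlan:
    """Ordered, human-readable rollback steps (hypothetical helper type)."""
    steps: List[str] = field(default_factory=list)


def build_rollback_plan(stopped_services: Iterable[str],
                        podified_resources: Iterable[str]) -> RollbackPlan:
    """Build a dry-run plan: start the 17.1 services, then delete podified content.

    Nothing is executed here. The source databases are never listed because
    the epic assumes they were not modified during adoption.
    """
    plan = RollbackPlan()
    # Step group 1: bring back the original 17.1 control plane services.
    for service in stopped_services:
        plan.steps.append(f"start 17.1 service: {service}")
    # Step group 2: remove the podified control plane content from OpenShift.
    for resource in podified_resources:
        plan.steps.append(f"delete podified resource: {resource}")
    return plan
```

      For example, `build_rollback_plan(["example-api"], ["example-controlplane"]).steps` lists the service start before the resource deletion. Both names are placeholders.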

      Documentation Considerations
      The deliverable is a set of documentation steps describing how to revert to a stable RHOSP 17.1 control plane.

      Interoperability Considerations

      Questions

      Question: What happens if some compute nodes are adopted during the Adoption? Can we support OSP 17.1 compute nodes alongside already adopted RHOSO 18 nodes, all managed by the RHOSO control plane?
      Outcome: Once the data plane hosts are touched, they are beyond the point of no return and the only option is to keep going forward.

      Proposed high-level procedure, after OSPRH-2301 and OSPRH-1490 are implemented

      • Perform control plane adoption according to the docs. This includes copying MariaDB and OVN DB contents from the source control plane to the podified one. No edits are made directly on the source databases. Any DB edits are performed only on the podified side, without affecting the source side.
      • Assess the configuration using OS Diff and tweak the control plane configuration as necessary.
      • Point of no return: if at any point during the prior steps it is concluded that a rollback should be performed, this is the last opportunity to perform it. (Rollback == start original control plane services and delete the podified resources.)
      • Adopt the data plane via EDPM. After the data plane is touched by EDPM, we consider the rollback to no longer be possible.
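
      The procedure above can be read as an ordered sequence of phases with a single cut-off for rollback. The following sketch encodes only that rule. The `Phase` enum and `rollback_possible` function are hypothetical names introduced for illustration and are not part of any adoption tooling.

```python
from enum import IntEnum


class Phase(IntEnum):
    """Phases of the proposed high-level procedure, in order."""
    CONTROL_PLANE_ADOPTION = 1   # copy MariaDB / OVN DB contents to the podified side
    CONFIG_ASSESSMENT = 2        # compare and tweak configuration (OS Diff)
    POINT_OF_NO_RETURN = 3       # last opportunity to decide on a rollback
    DATA_PLANE_ADOPTION = 4      # EDPM touches the data plane hosts


def rollback_possible(current: Phase) -> bool:
    """Return True while a rollback to the 17.1 control plane is still an option.

    Per the procedure, rollback is possible up to and including the point of
    no return, and not once EDPM has touched the data plane.
    """
    return current <= Phase.POINT_OF_NO_RETURN
```

      With this model, `rollback_possible(Phase.CONFIG_ASSESSMENT)` is true and `rollback_possible(Phase.DATA_PLANE_ADOPTION)` is false, which matches the outcome recorded in the Questions section.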

            Lukas Svaty added a comment - Hi arcsingh@redhat.com, I see Test Coverage is proposed, but no automation story is linked. Can you please link the TechDebt Epic with the relevant story? Or, if this is planned to be finalized within OSP18 GA, move this Epic back to Verified before automation is finalized and link the story.

            James Smith added a comment - If you think customers need a description of this issue in addition to the content of the Jira summary field, please set the 'Release Note Type' and provide draft text in the 'Release Note Text' field. The documentation team will review, edit, and approve the text.

            If the Jira issue already has 'Release Note Text', please perform a quick review of the text and update the 'Release Note Type' value or text if needed. For example, it was a Known Issue in the previous release but now it is a Bug Fix.

            When adding release notes, please consider the following guidelines:

            GA-only: Many of the enhancements introduced in RHOSO 18.0 GA will be described in the Top New and Enhanced Features section of the Release Notes. If this Jira issue tracks an enhancement that is likely to be described as a top new feature, you probably don’t need to write a `Release Note Text` for this issue. If you are unsure, consult with your technical writer.

            If the Jira issue is a bug fix that will be fixed in a future release but the bug affects the current release, ensure that the 'Release Note Type' is set to 'Known Issue' and that the Release Note Text describes the problem, not the planned fix.

            If this issue does not require a 'Release Note Text' description, please set 'Release Note Type' to 'Release Note Not Required'.

            For more information on how to set the release note fields and write release notes, see Release notes text process [1].
            [1] https://spaces.redhat.com/pages/viewpage.action?spaceKey=RHOSPDOC&title=How-to-Jira+with+the+RHOS+docs+team#HowtoJirawiththeRHOSdocsteam-Releasenotestextpr

            Jiri Stransky added a comment - Since this is verified, I'll mark it closed to drop it from our "to be finished" queries.

            Archana Singh added a comment - The test automation PRs have been tested in standalone and non-standalone environments. https://github.com/openstack-k8s-operators/data-plane-adoption/pull/372

            Jiri Stransky added a comment - PRs with the test suite additions have been merged: https://github.com/openstack-k8s-operators/data-plane-adoption/pull/372 https://github.com/openstack-k8s-operators/data-plane-adoption/pull/394

            Jiri Stransky added a comment - Test suite pull request: https://github.com/openstack-k8s-operators/data-plane-adoption/pull/372

            Jiri Stransky added a comment - The pull request with rollback docs is merged, I'll mark this dev complete.

            Jiri Stransky added a comment - Docs pull request posted: https://github.com/openstack-k8s-operators/data-plane-adoption/pull/311

            Jiri Stransky added a comment - Will have a pull request later today or tomorrow. Still hoping this will be dev complete this week.

            Ehud Shkalim added a comment - arcsingh@redhat.com jstransk@redhat.com  lgtm

              jstransk@redhat.com Jiri Stransky
              pnavarro@redhat.com Pedro Navarro Perez
              Archana Singh Archana Singh
              rhos-dfg-upgrades