  1. Red Hat OpenStack Services on OpenShift
  2. OSPRH-5665

As a cloud operator, I want to validate parts of the NodeSet config before the point of no return (EDPM adoption)


    • As a cloud operator, I want to validate osdpnses configuration, before the point of no return (EDPM adoption)
    • 14
    • False
    • False
    • OSPRH-813 - Red Hat OpenStack 18.0 Data Plane Adoption
    • Committed
    • Committed
    • To Do
    • OSPRH-813 - Red Hat OpenStack 18.0 Data Plane Adoption
    • openstack-ansible-ee-container-1.0.0-14
    • Committed
    • Committed
    • 0% To Do, 0% In Progress, 100% Done
    • Release Note Not Required
    • RHOSO18Beta waived:Upgrade: Adoption
    • Automated
    • 2023Q4, 2024Q1, 2024Q2
    • Approved

      Whenever the EDPM os-reboot role reboots a node, it disrupts workloads during adoption. We need to make this role smarter:
      if it detects a change in the reboot-worthy parameters (osdpns ansible vars),
      it should have a "safety switch" that selects between:

      • if changes detected -> reboot
      • if changes detected -> fail (early, not after the point of no return) with a message
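
The two switch positions above can be sketched as follows. This is a minimal illustration, not the os-reboot role itself: the names `OnChange`, `RebootRequiredError`, `changed_params` and `apply_safety_switch` are hypothetical, and the parameters are assumed to be available as plain dictionaries.

```python
from enum import Enum


class OnChange(Enum):
    """Hypothetical safety-switch positions; the names are illustrative."""
    REBOOT = "reboot"
    FAIL = "fail"


class RebootRequiredError(Exception):
    """Raised when a reboot-worthy change is found and the switch is FAIL."""


def changed_params(current: dict, desired: dict) -> dict:
    """Return {name: (current, desired)} for every parameter that differs."""
    names = set(current) | set(desired)
    return {
        name: (current.get(name), desired.get(name))
        for name in sorted(names)
        if current.get(name) != desired.get(name)
    }


def apply_safety_switch(current: dict, desired: dict, on_change: OnChange) -> bool:
    """Return True if the node should reboot, False if nothing changed.

    With OnChange.FAIL the function raises instead, so the operator gets a
    message early and can adjust the osdpns vars before the real deployment.
    """
    diff = changed_params(current, desired)
    if not diff:
        return False
    if on_change is OnChange.FAIL:
        details = ", ".join(
            f"{name}: {old!r} -> {new!r}" for name, (old, new) in diff.items()
        )
        raise RebootRequiredError(f"reboot-worthy parameters changed: {details}")
    return True
```

Raising an exception in the "fail" position keeps the message and the list of offending parameters together, which is what an operator would need in order to correct the NodeSet and retry.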

      The latter option is where this new validation step needs to come into play, before the "real deployment" happens (which is the point of no return for EDPM).

      Validations should do the following:

      • compare the tripleo wallaby (OSP17.1) specific configuration with the ansible vars configured in the osdpns(es), so that adoption does not cause conflicts (hostname FQDN/shortname), lose features (derived parameters), or change other cloud operational modes (stable uuids);
      • also detect config changes that require rebooting a node (like kernel args changes on the way from tripleo to rhoso);
      • give end users a chance to adjust the osdpns and re-iterate
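
A rough sketch of such a comparison for two of the settings named above, hostnames and kernel args. It assumes the source (tripleo) and target (osdpns) settings have already been collected into dictionaries. The keys `hostname` and `kernel_args` and the function `validate_node` are hypothetical and are not actual edpm-ansible variable names. Kernel args are compared as ordered token lists, so only whitespace differences are ignored.

```python
def validate_node(source: dict, target: dict) -> list:
    """Return human-readable findings; an empty list means no conflict found.

    `source` holds values taken from the existing deployment, `target` the
    values configured for the node in the NodeSet (both hypothetical shapes).
    """
    findings = []

    src_host, dst_host = source["hostname"], target["hostname"]
    if src_host != dst_host:
        # Same short name but different full name: FQDN vs. shortname mix-up.
        if src_host.split(".")[0] == dst_host.split(".")[0]:
            findings.append(
                f"hostname: FQDN/shortname mismatch ({src_host!r} vs {dst_host!r})"
            )
        else:
            findings.append(f"hostname: differs ({src_host!r} vs {dst_host!r})")

    # Token order is kept, since the order of kernel arguments can matter.
    src_args = source.get("kernel_args", "").split()
    dst_args = target.get("kernel_args", "").split()
    if src_args != dst_args:
        findings.append("kernel_args: change would require a reboot")

    return findings
```

Returning all findings at once, rather than stopping at the first, lets the operator fix the NodeSet in one pass before re-running the check.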

      Unfortunately, tools like os-diff or the tripleo validations framework are unlikely to help with that.

      DoD proposal after discussions:

      • Implement a role in edpm-ansible to perform validation that important settings are not changing during Adoption.
        • Use this to validate service hostnames
        • Use this to validate other settings: kernel args, tuned profiles
      • Add a validation step that runs the validation role to the Adoption docs and tests (a separate EDPM Deployment with just a single service for the validation). From a rollback perspective, this comes before the point of no return; it is the last thing done before adoption of the data plane nodes starts.
      • Tweak os-reboot role so that:
        • The reboot detection is smarter for Adoption nodes and avoids rebooting any nodes during Adoption.
        • The reboot can be explicitly disabled through a variable.
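
The intended os-reboot behaviour can be read as a simple gate. The sketch below is only an illustration of that logic: `should_reboot` and its parameter names are hypothetical, and the actual variable names in the role may differ.

```python
def should_reboot(reboot_required: bool, adoption: bool, reboot_enabled: bool = True) -> bool:
    """Decide whether a node may be rebooted.

    reboot_required: a reboot-worthy change was detected on the node.
    adoption:        the node is being adopted (assumed to be known to the caller).
    reboot_enabled:  explicit operator switch; False disables rebooting entirely.
    """
    if not reboot_enabled:
        return False  # the explicit variable always wins
    if adoption:
        return False  # never reboot a node during Adoption
    return reboot_required
```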

      Acceptance criteria:

      • The edpm-ansible validation role has been properly added.
      • The validation role has been added to the adoption test-suite downstream CI before EDPM Adoption.
      • Verify that the os-reboot role will not reboot any node during adoption (explicit variable to enable/disable rebooting).
      • The documented procedure is verified with standalone and HA OSP environments.

              mciecier@redhat.com Mikolaj Ciecierski
              bdobreli@redhat.com Bohdan Dobrelia
              Archana Singh
              rhos-dfg-upgrades