
      Feature Overview: 

      Ability to reconfigure the core OpenShift constructs after a cluster has been deployed. This should address various scenarios:

      • Pre-installing OCP as an appliance during a pre-staging process, then deploying and reconfiguring those clusters at a different location
      • Pre-installing OCP clusters on a cloud provider and keeping them on standby until a request for a new cluster is received. At that point, an existing standby cluster is reconfigured to serve the request without a new installation
      • Disaster recovery scenarios where a cluster is restored from backups at a DR location different from the primary location and must be reconfigured to operate on the new network

       

      This capability should apply to:

      Goals:

      • Architectural design for consistent OCP reconfiguration
      • All OCP core cluster operators should support reconfiguration for the use cases described in this card

      Requirements:

      The SNO relocation requires the ability to modify different configurations in the node and the cluster:

      • Support changing the hostname
      • Support setting DNS server
      • Support changing the cluster name
      • Support changing the cluster domain
      • Support changing the cluster ID
      • Support OCP relocation to a different network (change host IP)
      • Support changing OCP DNS
      • Certificate rotation (should align with https://issues.redhat.com/browse/OCPSTRAT-714)
      • Must maintain an auditable history of reconfigurations
      • Extend the initial kubelet and node cert validity to 30 days (maybe longer)
      • Factory SNO
        • Minimize deployment time: The deployment time at the far-edge site should be in the order of minutes, ideally less than 20 minutes.
        • Validation before shipment: The solution should allow partners and customers to validate each installed product before shipping it to the far edge, where it is costly to experience errors.
        • Simplify SNO deployment at the far edge: Non-technical operators should be able to reconfigure SNO at deployment time.
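
The attribute list above can be read as a single reconfiguration request that is validated before it is applied and recorded afterwards. The following is a minimal sketch of that idea, not an OpenShift API: `SiteConfig`, `validate`, and `audit_record` are hypothetical names, the fields simply mirror the requirement bullets, and the hostname is assumed to be a single DNS label.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import ipaddress
import re


# Hypothetical structure: fields mirror the requirement list, not a real OCP resource.
@dataclass(frozen=True)
class SiteConfig:
    hostname: str
    cluster_name: str
    cluster_domain: str
    host_ip: str
    dns_servers: tuple


# One DNS label: lowercase alphanumerics and hyphens, 1 to 63 characters.
_LABEL = re.compile(r"^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$")


def validate(cfg):
    """Return a list of human-readable problems; an empty list means the input looks sane."""
    errors = []
    for field in ("hostname", "cluster_name"):
        if not _LABEL.match(getattr(cfg, field)):
            errors.append(f"{field} is not a valid DNS label")
    if not all(_LABEL.match(part) for part in cfg.cluster_domain.split(".")):
        errors.append("cluster_domain is not a valid DNS name")
    for addr in (cfg.host_ip, *cfg.dns_servers):
        try:
            ipaddress.ip_address(addr)
        except ValueError:
            errors.append(f"{addr!r} is not a valid IP address")
    return errors


def audit_record(old, new, when=None):
    """Build one history entry listing only the attributes that changed."""
    when = when or datetime.now(timezone.utc)
    before, after = asdict(old), asdict(new)
    changes = {
        key: {"from": before[key], "to": after[key]}
        for key in before
        if before[key] != after[key]
    }
    return {"timestamp": when.isoformat(), "changes": changes}


# Example values use documentation-reserved addresses and domains.
factory = SiteConfig("sno-factory", "staging", "depot.example.com",
                     "192.0.2.10", ("192.0.2.1",))
site = SiteConfig("sno-site-17", "edge17", "edge.example.net",
                  "198.51.100.7", ("198.51.100.1",))
```

Validating input up front fits the requirement that non-technical operators enter the site-specific values, and appending one `audit_record` per change is one possible way to keep the auditable history; how the history is stored and how each change is actually applied to the cluster are outside this sketch.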

      Use Cases:

       

      1. Recovery of OCP clusters in a disaster recovery scenario where the cluster is restored from backups to a DR location where it cannot operate using the same identity as at the main location
      2. Ability to pre-provision clusters in cloud or virtualization environments and keep them as "standby clusters" until they are needed for immediate use, at which point a cluster is reconfigured as a day-2 operation, eliminating the need to install the platform or other workloads.
      3. Ability to create appliances with OCP which are then reconfigured as a day-2 operation when they arrive at their destination
      4. Ability to relocate a cluster across domains or naming schemes
      5. Ability for telecommunication providers and large-scale industrial deployments to follow a process where OCP is pre-installed at the factory or in a staging facility, including their specialized software stack on top of OCP, and to ship those pre-installed clusters (SNO, compact, multi-node) to their final locations and bring them up with site-specific information through a day-2 reconfiguration.

       

      OpenShift (SNO) reconfiguration
      This capability is critical for fast deployment at the edge and for validating a complete solution before shipping to the edge.

      Upon deployment at the edge site, the SNO should allow reconfiguring specific cluster attributes for OCP to function correctly at the edge site.

      The provisioning and reconfiguration flow on TME:

      Telecommunication providers have existing Service Depots where they currently prepare SW/HW prior to shipping servers to Far Edge sites. Pre-installing SNO onto servers in these facilities enables them to validate and update servers in these pre-installed server pools as needed.

      Telecommunications service provider technicians will be rolling out single node OpenShift with a vDU configuration to new Far Edge sites. They will work from a service depot where they pre-install a set of Far Edge servers to be deployed at a later date. When ready for deployment, a technician will take one of these generic OCP servers to a Far Edge site, enter the site-specific information, wait for confirmation that the vDU is in service and online, and then move on to deploy another server to a different Far Edge site.

       

      Multinode reconfiguration (disaster recovery)

      An organization that is required by regulation or policy to maintain a DR process needs the ability to restore the OCP cluster at different locations (DR sites) that do not have the same network attributes (e.g. domain, IP scheme) as the primary location. These organizations require a way to reconfigure the cluster to run at their DR sites.

       

      Questions to Answer (Optional):

      Q: How to enable these changes at the far edge? see doc

      Q: For each type of change (e.g. changing the cluster name or changing the IP of a control plane node), what is the blast radius of the change for OCP core components? Which components are affected, and how?

       

      Additional Considerations

       

      SNO Considerations:

      • Limited CPU and RAM resources: Customers expect the solution to have the same footprint as single node OpenShift, so the relocation capability shouldn't require any additional resources.
      • IPSec support at cluster boot: Some far-edge deployments occur on an insecure network, and for that reason access to the host's BMC is not allowed. Additionally, an IPSec tunnel must be established before any traffic leaves the cluster once it is at the Far Edge site. It is not possible to enable IPSec on the BMC NIC, so even after OpenShift has booted the BMC is still not accessible.
      • Static network support: Other far-edge deployments occur in environments without DHCP. In these deployments, the networking and site-specific configuration can be provided via the host's BMC.

        

      Documentation Considerations

      Provide information that needs to be considered and planned so that documentation will meet customer needs.  Initial completion during Refinement status.

       

      Interoperability Considerations

      Which other projects and versions in our portfolio does this feature impact?  What interoperability test scenarios should be factored by the layered products?  Initial completion during Refinement status.

            dfroehli42rh Daniel Fröhlich
            ercohen Eran Cohen
            Tushar Katarki Tushar Katarki