OpenShift Container Platform (OCP) Strategy / OCPSTRAT-1708

Control Plane fleet wide fix delivery mechanism

      Feature Overview (aka. Goal Summary)  

      A recurring concern when handling escalations and incidents in Managed OpenShift Hosted Control Planes is the time to resolution when the fix must be delivered in a component that ships within the OpenShift release payload. This is because OpenShift's release payloads:

      • Have a hotfix process that is customer/support-exception targeted rather than fleet targeted
      • Can take weeks to be available for Managed OpenShift

      This feature seeks to provide mechanisms that bring the upper bound on delivery time for such fixes in line with the current HyperShift Operator expectation of under 24 hours.

      Goals (aka. expected user outcomes)

      • Hosted Control Plane fixes are delivered through Konflux builds
      • No additional upgrade edges
      • Release specific
      • Adequate, fleet representative, automated testing coverage
      • Reduced human interaction

      Requirements (aka. Acceptance Criteria):

      • Overriding Hosted Control Plane components can be done automatically once the PRs are ready and the affected versions have been properly identified
      • Managed OpenShift Hosted Clusters have the fix applied to their Control Planes without requiring customer intervention and without workload disruption beyond what the incident being resolved may already have caused
      • Fix can be promoted through integration, stage and production canary with a good degree of observability
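
      The promotion requirement above can be pictured as a simple gate. The following is a minimal sketch under stated assumptions, not the HyperShift or Konflux implementation: the `Override` type, the stage identifiers, and the `healthy` callback are hypothetical names chosen for illustration. It shows a fix that targets an explicit set of affected versions and advances through integration, stage and production canary only while each environment reports healthy.

```python
from dataclasses import dataclass
from typing import Callable, List

# Promotion order taken from the requirement above; the identifiers are illustrative.
STAGES = ["integration", "stage", "production-canary"]


@dataclass(frozen=True)
class Override:
    """Hypothetical description of a fleet-wide fix for one component."""
    component: str
    image: str
    affected_versions: frozenset

    def applies_to(self, cluster_version: str) -> bool:
        # Release specific: only clusters on an identified version are touched.
        return cluster_version in self.affected_versions


def promote(override: Override, healthy: Callable[[str, Override], bool]) -> List[str]:
    """Return the stages the override passed, stopping at the first unhealthy one."""
    passed = []
    for stage in STAGES:
        # `healthy` stands in for whatever observability signal gates promotion.
        if not healthy(stage, override):
            break
        passed.append(stage)
    return passed
```

      In this sketch the cluster version itself is never changed; only the component image is swapped for clusters whose version is in `affected_versions`, which mirrors the "no additional upgrade edges" goal.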

       

      Deployment considerations | List applicable specific needs (N/A = not applicable)
      Self-managed, managed, or both | Managed (ROSA and ARO)
      Classic (standalone cluster) | No
      Hosted control planes | Yes
      Multi node, Compact (three node), or Single node (SNO), or all | All supported ROSA/HCP topologies
      Connected / Restricted Network | All supported ROSA/HCP topologies
      Architectures, e.g. x86_64, ARM (aarch64), IBM Power (ppc64le), and IBM Z (s390x) | All supported ROSA/HCP topologies
      Operator compatibility | CPO and Operators depending on it
      Backport needed (list applicable versions) | TBD
      UI need (e.g. OpenShift Console, dynamic plugin, OCM) | No
      Other (please specify) | No

      Use Cases (Optional):

      • Incident response when the engineering solution lies partially or completely on the Hosted Control Plane side rather than in the HyperShift Operator

      Out of Scope

      • HyperShift Operator binary bundling

      Background

      Discussed previously during incident calls. Design discussion document

      Customer Considerations

      • Because the Managed Control Plane version does not change while its components are overridden, customer visibility and impact should be kept to a minimum.

      Documentation Considerations

      SOP needs to be defined for:

      • Requesting and approving the fleet wide fixes described above
      • Building and delivering them
      • Identifying clusters with deployed fleet wide fixes
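
      As an illustration of the last SOP item, the sketch below filters a fleet inventory for clusters that carry at least one override. It is an assumption-laden example: the record shape (`id`, `overrides`) is hypothetical and does not reflect any actual OCM or HyperShift API.

```python
from typing import Dict, Iterable, List


def clusters_with_overrides(inventory: Iterable[Dict]) -> List[str]:
    """Return the sorted IDs of clusters reporting at least one component override.

    Each record is assumed (hypothetically) to look like:
    {"id": "...", "overrides": {"<component>": "<image>"}}
    A missing or empty "overrides" entry means no fleet-wide fix is deployed.
    """
    return sorted(c["id"] for c in inventory if c.get("overrides"))
```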

              Unassigned
              Antoni Segura Puimedon (asegurap1@redhat.com)
              He Liu
              Laura Hinson
              Cesar Wong
              Adel Zaalouk