OCPSTRAT-2843

Version-specific Cluster Upgrade Preflight Checks (tech-preview)


    • Product / Portfolio Work
    • OCPSTRAT-2837 OpenShift Upgrades - MVP
    • 100% To Do, 0% In Progress, 0% Done
    • Tech Preview

      Feature Overview (aka. Goal Summary)

      Cluster admins will gain the ability to trigger an on-demand, low-impact compatibility report against a specific target image. This report provides a clear list of blocking concerns and actionable steps required to unblock the update. This Feature is delivering the tech-preview work. The follow-up work to make the functionality generally available is tracked in OCPSTRAT-2815.

      Goals (aka. expected user outcomes)

      This feature introduces a pre-update preflight harness that allows OpenShift clusters to run compatibility logic extracted directly from a candidate target release. By moving away from static backported checks, we enable more accurate, scalable, and automated readiness validations—including support for future skip-level updates—ensuring cluster admins have high confidence before initiating an update.

      • Elimination of Manual Audits: For example, on clusters using manual-mode cloud credentials (CCO), admins currently perform manual compatibility audits. This feature automates that process, warning the admin only when changes are strictly required.
      • Reduced "Multi-Hop" Requirements: Currently, admins often must update to the latest "z-stream" of a minor release just to pick up new upgrade guards. By extracting logic from the target payload, admins can move directly from an older 4.(y-1) to a new 4.y release with the latest guards already active.
      • Future-Proofing for Skip-Levels: This architecture is a prerequisite for safe skip-level updates, as it allows a 4.y cluster to evaluate compatibility logic defined in a 4.(y+2) payload, which the current Upgradeable condition cannot achieve.
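
      The idea behind the goals above can be sketched as follows. This is a minimal illustration under stated assumptions, not the harness design: every name here (PreflightCheck, ReleasePayload, run_preflight, the credentials_mode and credentials_reviewed keys) is hypothetical and does not come from openshift/api. The point it shows is that the checks travel with the target release and are evaluated against the current cluster's state.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical model only; none of these names come from openshift/api.
ClusterState = dict  # simplified snapshot of the running cluster


@dataclass(frozen=True)
class PreflightCheck:
    name: str
    # Returns a human-readable concern, or None when the cluster is compatible.
    evaluate: Callable[[ClusterState], Optional[str]]


@dataclass(frozen=True)
class ReleasePayload:
    version: str
    checks: tuple  # the checks shipped with this release


def run_preflight(state: ClusterState, target: ReleasePayload) -> list:
    """Evaluate the target payload's checks against the current cluster state."""
    concerns = []
    for check in target.checks:
        concern = check.evaluate(state)
        if concern is not None:
            concerns.append(f"{check.name}: {concern}")
    return concerns


def manual_credentials_check(state: ClusterState) -> Optional[str]:
    # Illustrative rule: warn only when manual-mode credentials are in use
    # and have not yet been reviewed for the target release.
    if state.get("credentials_mode") == "Manual" and not state.get("credentials_reviewed"):
        return "manual-mode cloud credentials need review for the target release"
    return None


target = ReleasePayload(
    version="4.y",
    checks=(PreflightCheck("ManualCloudCredentials", manual_credentials_check),),
)
```

      In this sketch, a cluster that does not use manual-mode credentials gets an empty list back, which mirrors the goal of warning the admin only when changes are strictly required.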

      Requirements (aka. Acceptance Criteria)

      • Automated Validation: A cluster-admin can successfully request a preflight check vs. a target release and receive a report containing both blocking concerns and actionable remediation steps.
      • Skipped releases: A cluster admin can successfully request a preflight check vs. a target update release which is not the next immediate minor version (e.g. 4.22 -> 5.0 or 5.2 -> 5.5).
      • Harness Extraction: Verification that the logic is successfully extracted and executed from the target release payload rather than the local cluster state.
      • Documentation: Complete technical and user-facing documentation for running preflights in Tech Preview.
      • Technical Approval: Successful API-review and enhancement approval for associated openshift/api changes.
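
      As a rough illustration of the first requirement, the report could be modeled as a list of concerns, each flagged as blocking or not and carrying its remediation steps. The types and field names below (Concern, PreflightReport, blocking, remediation) are assumptions made for this sketch; the real shape is subject to the API review mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class Concern:
    summary: str
    blocking: bool
    remediation: list = field(default_factory=list)  # actionable steps, in order


@dataclass
class PreflightReport:
    target_release: str
    concerns: list = field(default_factory=list)

    def blocking_concerns(self) -> list:
        """Return only the concerns that block the update."""
        return [c for c in self.concerns if c.blocking]

    def is_unblocked(self) -> bool:
        """True when no concern blocks the update."""
        return not self.blocking_concerns()

    def render(self) -> str:
        """Format the report as plain text, one concern per bullet."""
        lines = [f"Preflight report for {self.target_release}"]
        for concern in self.concerns:
            label = "BLOCKING" if concern.blocking else "info"
            lines.append(f"- [{label}] {concern.summary}")
            for step in concern.remediation:
                lines.append(f"    remediation: {step}")
        return "\n".join(lines)
```

      Keeping remediation steps attached to each concern, rather than in a separate list, is one way to make the report directly actionable.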

       

      Deployment considerations | List applicable specific needs (N/A = not applicable)
      Self-managed, managed, or both | This Feature is about self-managed. Managed may want to wrap it, but no known tracker for them yet.
      Classic (standalone cluster) | Yes, target goal
      Hosted control planes | Out of scope for 4.22, given limited timeline
      Multi node, Compact (three node), or Single node (SNO), or all | All
      Connected / Restricted Network | All
      Architectures, e.g. x86_x64, ARM (aarch64), IBM Power (ppc64le), and IBM Z (s390x) | All
      Operator compatibility | N/A
      Backport needed (list applicable versions) | N/A
      UI need (e.g. OpenShift Console, dynamic plugin, OCM) | Due to limited time, Console work will only begin in the GA OCPSTRAT-2815
      Other (please specify) |

      Use Cases

      For component maintainers, this allows for compatibility checks to ship with the release that requires them. This should be less work than the current approach, where "what's about to come in 4.y" knowledge comes in 4.y, and then needs to be backported to 4.(y-1) controllers to set the Upgradeable condition. One explicit example would be automating the manual cloud-cred compatibility check for clusters that use manual-mode credentials. In addition, preflights extracted from the update-target payload would scale conveniently to skip-level updates (OCPSTRAT-2638) while the Upgradeable condition approach is limited to discussing the next minor release.

      For update graph-data admins, long-running update risks like IPsecLargeClusterConnectivity (CORENET-6196) could be declared "fixed" once the risk detection moves into the component operator, reducing the number of situations where we have to keep asking clusters to evaluate PromQL. This also reduces the number of situations where we'd need to raise the minor_min version to pick up new guard logic (e.g. graph-data#8528).

      For cluster-admins, this enables safe skip-level updates, if we move ahead with OCPSTRAT-2638. And regardless of whether we move ahead with OCPSTRAT-2638, they have a better chance of being able to do a direct 4.(y-1).old > 4.y new, vs. the current flow where they sometimes need a 4.(y-1).old > 4.(y-1).new > 4.y multi-hop to pick up a new guard with a minor_min bump. And cluster-admins using manual-mode cloud credentials would no longer need to manually check those for compatibility with the new release.
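
      To make the update kinds discussed above concrete, here is a small sketch that classifies a requested update by its minor-version distance. The function names are hypothetical, and the sketch deliberately refuses cross-major requests (such as 4.22 -> 5.0) because this document does not define how minor versions are ordered across a major boundary.

```python
def parse_minor(version: str) -> tuple:
    """Split 'X.Y' or 'X.Y.Z' into (major, minor) integers."""
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError(f"expected at least major.minor, got {version!r}")
    return int(parts[0]), int(parts[1])


def classify_update(current: str, target: str) -> str:
    """Classify an update within one major version as patch, minor, or skip-level."""
    cur_major, cur_minor = parse_minor(current)
    tgt_major, tgt_minor = parse_minor(target)
    if cur_major != tgt_major:
        # Assumption: ordering across major versions is not modeled here.
        raise ValueError("cross-major updates are not modeled in this sketch")
    distance = tgt_minor - cur_minor
    if distance < 0:
        raise ValueError("target is older than the current release")
    if distance == 0:
        return "patch"
    if distance == 1:
        return "minor"
    return "skip-level"
```

      Under these assumptions, classify_update("5.2", "5.5") returns "skip-level", which is the kind of request the Upgradeable condition cannot describe because it only speaks about the next minor release.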

      Questions to Answer

      None

      Out of Scope

      Console integration, HyperShift integration, MicroShift, and managed/OCM integration are all out of scope for this 4.22 tech-preview work.

      Background

      Changes vs. the current state are covered in the earlier Use Cases section.

      Customer Considerations

      Not applicable. Or maybe I just don't understand what customer considerations are.

      Documentation Considerations

      This will extend our current standalone OCP update docs. Unclear if we get into the complication of OSDOCS-15792.

      Interoperability Considerations

      ARO will get this for free, although they might need to update their documentation to allow their customers to access the functionality.

      ROSA/OSD would need OCM changes, and nobody's reached out to the OCM folks yet to ask if they're interested.

              rh-ee-smodeel Subin M
              DanielMesser Daniel Messer
              W. Trevor King W. Trevor King
              Jia Liu Jia Liu
              Courtney Bippley Courtney Bippley
              Eric Rich Eric Rich