OpenShift Container Platform (OCP) Strategy / OCPSTRAT-192

OLM v1: Installation and Update preflight checks (F6)


    • Strategic Product Work
    • OCPSTRAT-27 OLM V1: Operators, Operator Lifecycle Management, and Operator Hub

      Feature Overview

      • When installing or updating extensions, OLM 1.0 should carry out a set of preflight checks so that an attempted install or update does not leave the installation in a failed state.
      • Preflight checks should cover the availability and, where applicable, the health of running dependencies, as well as any cluster runtime constraints (F19).

      Goals

      • Users have a better way to troubleshoot issues by relying on a fail-fast approach: before trying to bring up an extension in the cluster, OLM 1.0 checks the conditions and dependencies that would make the operator fail.
      • Compared to today, these failures would be raised faster and be presented immediately to the user in an easy-to-discover way
      • Preflight checks are driven by constraints the extension specifies, either against properties of other extensions or against properties of the cluster it is installed on
      • Preflight checks tell users whether or not the extension is going to work on the cluster, but they are not strictly gating: a user can override a failed check (see Requirements)
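The idea of constraints evaluated against cluster properties or against properties of other extensions can be sketched as follows. This is only an illustration, not OLM's API: the `Constraint` type, the `evaluate` function, and the property names are all hypothetical, and properties are modeled as plain string key/value pairs.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Constraint:
    """Hypothetical constraint declared by an extension (not an OLM type)."""
    scope: str     # "cluster", or the name of another extension
    prop: str      # name of the property to inspect
    expected: str  # value the property must have


def evaluate(
    constraints: List[Constraint],
    cluster_props: Dict[str, str],
    extension_props: Dict[str, Dict[str, str]],
) -> List[str]:
    """Return one finding per violated constraint; an empty list means all hold."""
    findings = []
    for c in constraints:
        if c.scope == "cluster":
            props = cluster_props
        else:
            props = extension_props.get(c.scope)
            if props is None:
                # The extension this constraint depends on is not present at all.
                findings.append(f"extension {c.scope!r} is not installed")
                continue
        actual = props.get(c.prop)
        if actual != c.expected:
            findings.append(
                f"{c.scope}.{c.prop}: expected {c.expected!r}, found {actual!r}"
            )
    return findings
```

The function only reads its inputs and returns findings, which matches the goal that checks inform the user without changing anything on the cluster.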

      Requirements

       

      Requirement | Notes | isMvp?
      Preflight checks are run before installation | | YES
      Preflight checks are run before updates | | YES
      Preflight checks are fatal by default but can be made non-fatal with a user-supplied force override | | YES
      Failed Preflight checks are reported so they can be easily discovered by the cluster admin | | YES
      Failed Preflight checks do not create resources that are left to be cleaned up manually | | YES
      Failed Preflight checks as part of updates do not change the health state of the currently installed extensions | | YES
      CI - MUST be running successfully with test automation | This is a requirement for ALL features. | YES
      Release Technical Enablement | Provide necessary release enablement details and documents. | YES
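A minimal sketch of how the requirements above could fit together: checks run before anything is applied, failures are fatal unless the user forces the operation, and every failure is reported either way. The names `run_preflight`, `install`, and `PreflightResult` are hypothetical, and the cluster is modeled as a plain list of installed extension names rather than a real API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# A check returns None on success or a failure message (assumed convention).
Check = Callable[[], Optional[str]]


@dataclass
class PreflightResult:
    proceed: bool
    failures: List[str] = field(default_factory=list)


def run_preflight(checks: List[Check], force: bool = False) -> PreflightResult:
    """Run every check and collect all failures so they can be reported together."""
    failures = [msg for msg in (check() for check in checks) if msg is not None]
    # Fatal by default: proceed only if nothing failed or the user forced it.
    return PreflightResult(proceed=(not failures) or force, failures=failures)


def install(name: str, checks: List[Check], cluster: List[str],
            force: bool = False) -> PreflightResult:
    """Touch the (stand-in) cluster state only after preflight allows it."""
    result = run_preflight(checks, force)
    if result.proceed:
        cluster.append(name)
    return result
```

Because the cluster state is modified only after the checks have run, a failed preflight leaves nothing behind to clean up, and the failure messages stay available in the result even when `force` is set.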

      Use Cases

      Main Use Cases:

      • An extension defines a constraint in the form of a minimum version of Kubernetes. Attempts to install the extension on a cluster version below the minimum version fail immediately without changing the cluster state.
      • An extension has a constraint in the form of a minimum version of Kubernetes that is currently satisfied. An update to a version of the extension that requires a higher Kubernetes version than the cluster is currently running fails immediately without changing the cluster state.
      • Extensions run on clusters to provide a managed service that has aggressive uptime SLOs. Preflight checks that OLM 1.0 performs in a fail-fast manner, without disrupting currently running extensions, help SRE teams maintain their SLOs.
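The minimum Kubernetes version use case can be sketched as a pure comparison that runs before any install or update is attempted. The function names are hypothetical, and the version parsing is deliberately simplified: it assumes plain `major.minor[.patch]` strings with an optional leading `v`, and does not handle pre-release or build suffixes.

```python
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse 'v1.27.3' or '1.27' into a tuple of ints, padded to three parts."""
    parts = [int(p) for p in version.lstrip("v").split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def check_min_kubernetes(cluster_version: str, min_version: str) -> Optional[str]:
    """Return None if the constraint is satisfied, otherwise a failure message."""
    if parse_version(cluster_version) < parse_version(min_version):
        return (
            f"cluster runs Kubernetes {cluster_version}, "
            f"extension requires at least {min_version}"
        )
    return None
```

The same check covers both install and update: for an update, `min_version` would come from the target version of the extension, so a failing result can be reported while the currently installed version keeps running untouched.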

      Definition of Done / Acceptance criteria

      • All requirements marked as MVP are implemented

      Background, and strategic fit

      This is part of a larger effort to re-design vital parts of the OLM APIs and conceptual models to fit the use case of OLM in managed service environments, GitOps-controlled infrastructure, and restrictive self-managed deployments in Enterprise environments. You can learn more about it here: https://docs.google.com/document/d/1LX4dJMbSmuIMn98tCiBaONmTunWR6zdUJuH-8uZ8Cno/edit?usp=sharing

      Assumptions

      • Ultimately, a flexible constraint modeling mechanism should exist for OLM-packaged extensions to express dependencies on other extensions (as detailed in OLM PRD F19).

      Relevant upstream CNCF OLM engineering brief(s):

      Documentation Considerations

      • The way and order in which constraints are evaluated as part of install / update preflight checks are documented
      • The place in which information about failed preflight checks can be found needs to be documented
      • Users need to understand how they can force an extension to install/update despite failed preflight checks

              Daniel Messer
              Jian Zhang
              Matthew Werner
              Joe Lanford