- When installing or updating extensions, OLM 1.0 should carry out a set of preflight checks so that an attempted install or upgrade does not leave the installation in a failed state.
- Preflight checks should include availability and (if applicable) health of (running) dependencies and any cluster runtime constraints (F19).
- Users have a better way to troubleshoot issues through a fail-fast approach: before trying to bring up an extension in the cluster, OLM 1.0 checks the conditions and dependencies that would make the operator fail.
- Compared to today, these failures are raised faster and presented to the user immediately, in an easy-to-discover way.
- Preflight checks are run as a result of constraints the extension specifies, either against properties of other extensions or against properties of the cluster the extension is installed on.
- Preflight checks provide input to users on whether or not the extension is going to work on the cluster, but they are not gating.
|Preflight checks are run before installation
|Preflight checks are run before updates
|Preflight checks are fatal by default but can be made non-fatal with a user-supplied force override
|Failed Preflight checks are reported so they can be easily discovered by the cluster admin
|Failed Preflight checks do not create resources that are left to be cleaned up manually
|Failed Preflight checks as part of updates do not change the health state of the currently installed extensions
|CI - MUST be running successfully with test automation
|This is a requirement for ALL features.
|Release Technical Enablement
|Provide necessary release enablement details and documents.
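The preflight requirements listed above (run before installs and updates, fatal by default with a user-supplied force override, failures reported, no resources left behind) could be sketched roughly as follows. This is only an illustration under stated assumptions: `PreflightResult`, `run_preflight`, `install_or_update`, and the `force` flag are hypothetical names and are not part of any OLM 1.0 API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical types for illustration only; none of these names come from OLM 1.0.

# A check returns None on success, or a human-readable failure message.
Check = Callable[[], Optional[str]]


@dataclass
class PreflightResult:
    proceed: bool                                  # whether the install/update may go ahead
    failures: List[str] = field(default_factory=list)  # every failed check, for reporting


def run_preflight(checks: List[Check], force: bool = False) -> PreflightResult:
    """Run all checks and collect every failure so they can be reported together."""
    failures = [msg for msg in (check() for check in checks) if msg is not None]
    # Fatal by default: any failure blocks the operation unless the user forces it.
    return PreflightResult(proceed=force or not failures, failures=failures)


def install_or_update(checks: List[Check], apply: Callable[[], None],
                      force: bool = False) -> PreflightResult:
    """Call `apply` (the step that changes the cluster) only after preflight passes."""
    result = run_preflight(checks, force)
    if result.proceed:
        apply()
    # When preflight fails, `apply` is never called, so nothing needs cleaning up.
    return result
```

Because the cluster-changing step is only invoked after the checks have run, a failed preflight leaves the currently installed extensions untouched, and the collected `failures` list gives one place to surface the reasons to the cluster admin.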
Main Use Case:
- An extension defines a constraint in the form of a minimum version of Kubernetes. Attempts to install the extension on a cluster version below the minimum version fail immediately without changing the cluster state.
- An extension has a constraint in the form of a minimum version of Kubernetes that is currently satisfied. Attempting to update the extension to a version that has a constraint on the version of Kubernetes higher than what the cluster is currently running fails immediately without changing the cluster state.
- Extensions run on clusters to provide a managed service that has aggressive uptime SLOs. Preflight checks that OLM 1.0 performs in a fail-fast manner, without disrupting the currently running extensions, help SRE teams maintain their SLOs.
- All requirements needed for the MVP are implemented.
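The minimum-Kubernetes-version use cases above can be illustrated with a small check. This is a minimal sketch, assuming versions are given as strings such as `v1.27.3` and that only the major and minor components matter for the constraint; `parse_version` and `check_min_kube_version` are hypothetical helpers, not OLM 1.0 functions.

```python
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, int]:
    """Parse 'v1.27.3' or '1.27' into (major, minor).

    Assumption for this sketch: the patch component is ignored.
    """
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def check_min_kube_version(cluster: str, minimum: str) -> Optional[str]:
    """Return None if the cluster satisfies the minimum, else a failure message."""
    if parse_version(cluster) < parse_version(minimum):
        return (f"cluster Kubernetes version {cluster} "
                f"is below required minimum {minimum}")
    return None


# Install attempt on a cluster that is too old: the check fails immediately.
print(check_min_kube_version("v1.25.4", "1.27"))
# Update attempt where the new version's constraint is still satisfied.
print(check_min_kube_version("v1.27.0", "1.27"))
```

The same check covers both use cases: for an install it is evaluated against the constraint of the version being installed, and for an update against the constraint of the target version, in either case before any cluster state is changed.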
This is part of a larger effort to re-design vital parts of the OLM APIs and conceptual models to fit the use case of OLM in managed service environments, GitOps-controlled infrastructure, and restrictive self-managed deployments in Enterprise environments. You can learn more about it here: https://docs.google.com/document/d/1LX4dJMbSmuIMn98tCiBaONmTunWR6zdUJuH-8uZ8Cno/edit?usp=sharing
- Ultimately, a flexible constraint modeling mechanism should exist for OLM-packaged extensions to express dependencies on other extensions (as detailed in OLM PRD F19).
- The way and order in which constraints are evaluated as part of install / update preflight checks are documented
- The place where information about failed preflight checks can be found is documented
- Users need to understand how they can force an extension to install/update despite failed preflight checks