-
Feature
-
Resolution: Unresolved
-
Major
-
BU Product Work
-
OCPSTRAT-27 OLM V1: Operators, Operator Lifecycle Management, and Operator Hub
Feature Overview
- When installing or updating extensions, OLM 1.0 should run a set of preflight checks so that an attempted install or upgrade does not leave the installation in a failed state.
- Preflight checks should include availability and (if applicable) health of (running) dependencies and any cluster runtime constraints (F19).
Goals
- Users can troubleshoot issues more easily through a fail-fast approach: before trying to bring up an extension in the cluster, OLM 1.0 checks the conditions and dependencies that would otherwise cause the operator to fail
- Compared to today, these failures are raised faster and are presented to the user immediately, in an easy-to-discover way
- Preflight checks are run as a result of constraints the extension specifies, either against properties of other extensions or properties of the cluster that they are installed on
- Preflight checks tell users whether the extension is expected to work on the cluster; they are not an unconditional gate, because a failed check can be overridden by the user (see Requirements)
Requirements
Requirement | Notes | isMvp? |
---|---|---|
Preflight checks are run before installation | | YES |
Preflight checks are run before updates | | YES |
Preflight checks are fatal by default but can be made non-fatal with a user-supplied force override | | YES |
Failed Preflight checks are reported so they can be easily discovered by the cluster admin | | YES |
Failed Preflight checks do not create resources that are left to be cleaned up manually | | YES |
Failed Preflight checks as part of updates do not change the health state of the currently installed extensions | | YES |
CI - MUST be running successfully with test automation | This is a requirement for ALL features. | YES |
Release Technical Enablement | Provide necessary release enablement details and documents. | YES |
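
The requirements above could be sketched roughly as follows. This is only an illustration in Python, not the OLM 1.0 API: `PreflightCheck`, `PreflightReport`, `run_preflight`, and the `force` flag are hypothetical names. The sketch assumes that checks are functions over a read-only snapshot of cluster state, which is one possible way to ensure a failed check leaves nothing behind to clean up.

```python
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import Callable, Dict, Iterable, Mapping, Optional

# Hypothetical types for illustration; none of these names come from OLM 1.0.
# A check reads a snapshot and returns an error message, or None on success.
CheckFn = Callable[[Mapping[str, str]], Optional[str]]


@dataclass(frozen=True)
class PreflightCheck:
    name: str
    run: CheckFn


@dataclass
class PreflightReport:
    failures: Dict[str, str] = field(default_factory=dict)
    forced: bool = False

    @property
    def may_proceed(self) -> bool:
        # Failures are fatal by default; a user-supplied force override
        # makes them non-fatal, but they stay in the report.
        return not self.failures or self.forced


def run_preflight(
    checks: Iterable[PreflightCheck],
    cluster_state: Mapping[str, str],
    force: bool = False,
) -> PreflightReport:
    """Run all checks before an install or update and collect every failure."""
    # Checks get a read-only copy, so they cannot modify the cluster state.
    snapshot = MappingProxyType(dict(cluster_state))
    report = PreflightReport(forced=force)
    for check in checks:
        message = check.run(snapshot)
        if message is not None:
            report.failures[check.name] = message
    return report


def crd_present(state: Mapping[str, str]) -> Optional[str]:
    # Hypothetical dependency check used only for this example.
    if state.get("widgets-crd") == "installed":
        return None
    return "required CRD 'widgets' is missing"


checks = [PreflightCheck("crd-present", crd_present)]
state = {"widgets-crd": "absent"}
blocked = run_preflight(checks, state)             # fatal by default
forced = run_preflight(checks, state, force=True)  # override, failure still reported
```

Collecting all failures instead of stopping at the first one is a design choice of this sketch; it gives the cluster admin the full list in a single report.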
Use Cases
Main Use Cases:
- An extension defines a constraint in the form of a minimum version of Kubernetes. Attempts to install the extension on a cluster version below the minimum version fail immediately without changing the cluster state.
- An extension has a constraint in the form of a minimum version of Kubernetes that is currently satisfied. Attempting to update the extension to a version that has a constraint on the version of Kubernetes higher than what the cluster is currently running fails immediately without changing the cluster state.
- Extensions run on clusters to provide a managed service with aggressive uptime SLOs. When OLM 1.0 performs preflight checks in a fail-fast manner that does not disrupt the currently running extensions, SRE teams are better able to maintain those SLOs.
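
The first two use cases can be sketched as a version comparison that runs before any change is applied. This is a minimal Python illustration under stated assumptions: `parse_version` and `check_min_kube_version` are hypothetical helpers, not OLM 1.0 functions, and the sketch only accepts versions of the form `major.minor[.patch]` with an optional leading `v`. Version strings with build or pre-release suffixes are rejected rather than interpreted.

```python
import re
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse 'v1.27.3' or '1.27' into (major, minor, patch)."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)(?:\.(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"unsupported version string: {text!r}")
    # A missing patch component is treated as 0.
    major, minor, patch = match.groups(default="0")
    return int(major), int(minor), int(patch)


def check_min_kube_version(cluster_version: str, min_version: str) -> Optional[str]:
    """Return None if the constraint holds, else a message for the cluster admin."""
    # Tuples compare numerically, so 1.9 sorts before 1.10.
    if parse_version(cluster_version) >= parse_version(min_version):
        return None
    return (
        f"requires Kubernetes >= {min_version}, "
        f"but the cluster runs {cluster_version}"
    )


# Install: the cluster is older than the minimum, so the check fails up front.
install_result = check_min_kube_version("1.26.4", "1.27")

# Update: the installed version's constraint holds, the target version's does not.
current_ok = check_min_kube_version("1.27.3", "1.27")
update_result = check_min_kube_version("1.27.3", "1.28")
```

Because the check only compares two strings and returns a message, a failure in either case leaves the cluster and the currently installed extension untouched.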
Definition of Done / Acceptance criteria
- All requirements marked as MVP are implemented
Background, and strategic fit
This is part of a larger effort to re-design vital parts of the OLM APIs and conceptual models to fit the use case of OLM in managed service environments, GitOps-controlled infrastructure, and restrictive self-managed deployments in Enterprise environments. You can learn more about it here: https://docs.google.com/document/d/1LX4dJMbSmuIMn98tCiBaONmTunWR6zdUJuH-8uZ8Cno/edit?usp=sharing
Assumptions
- Ultimately, a flexible constraint modeling mechanism should exist for OLM-packaged extensions to express dependencies on other extensions (as detailed in OLM PRD F19).
Relevant upstream CNCF OLM engineering brief(s):
Documentation Considerations
- The way and order in which constraints are evaluated during install and update preflight checks need to be documented
- The place where information about failed preflight checks can be found needs to be documented
- Users need to understand how they can force an extension to install/update despite failed preflight checks
- relates to
-
OCPSTRAT-443 [Phase 1 MVP/Tech Preview] OLM 1.0 - Extension Installation (F7)
- Closed
-
OCPSTRAT-450 [Phase 1 MVP/Tech Preview] OLM 1.0 - Extension updates (F10)
- Closed