
    • Strategic Product Work
    • OCPSTRAT-27 OLM V1: Operators, Operator Lifecycle Management, and Operator Hub
    • 0% To Do, 6% In Progress, 94% Done
    • 0
    • Program Call

      Feature Overview (aka. Goal Summary)  

      • With this next-gen OLM GA release (graduated from ‘Tech Preview’), customers can: 
        • discover collections of k8s extension/operator contents released in the FBC format, with richer visibility into their release channels, versions, update graphs, and deprecation information (if any), to make informed decisions about installing and/or updating them.
        • install a k8s extension/operator declaratively and potentially automate with GitOps to ensure predictable and reliable deployments.
        • update a k8s extension/operator to a desired target version or keep it updated within a specific version range for security fixes without breaking changes.
        • remove a k8s extension/operator declaratively and entirely, including cleaning up its CRDs and other relevant on-cluster resources (with a way to opt out of this coming up in a later release).
      • To address the security needs of 30% of our customers who run clusters in disconnected environments, the GA release will include cluster extension lifecycle management functionality for offline environments.
      • [Tech Preview] (Cluster)Extension lifecycle management can handle runtime signature validation for container images to support OpenShift’s integration with the rising Sigstore project for secure validation of cloud-native artifacts.

      Goals (aka. expected user outcomes)

      1. Pre-installation:

      • Customers can access a collection of k8s extension contents from a set of default catalogs leveraging the existing catalog images shipped with OpenShift (in the FBC format) with the new Catalog API from the OLM v1 GA release.
      • With the new GAed Catalog API, customers get richer package content visibility in their release channels, versions, update graphs, and the deprecation information (if any) to help make informed decisions about installation and/or update.
      • With the new GAed Catalog API, customers can render the catalog content in their clusters with fewer resources in terms of CPU and memory usage and faster performance.
      • Customers can filter the available packages based on the package name and see the relevant information from the metadata shipped within the package. 
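The package filtering described above can be sketched as follows. This is a minimal illustration, assuming catalog content is available as a stream of FBC-style JSON objects using the `olm.package`, `olm.channel`, and `olm.bundle` schemas. The sample data, the registry URL, and the helper names `load_entries` and `summarize_package` are hypothetical and are not part of the Catalog API.

```python
import json

# Hypothetical FBC-style catalog content, one JSON object per line.
SAMPLE_CATALOG = """
{"schema": "olm.package", "name": "example-operator", "defaultChannel": "stable"}
{"schema": "olm.channel", "package": "example-operator", "name": "stable", "entries": [{"name": "example-operator.v1.0.0"}, {"name": "example-operator.v1.1.0", "replaces": "example-operator.v1.0.0"}]}
{"schema": "olm.bundle", "package": "example-operator", "name": "example-operator.v1.0.0", "image": "registry.example.com/example-operator-bundle:v1.0.0"}
{"schema": "olm.package", "name": "other-operator", "defaultChannel": "alpha"}
"""


def load_entries(text):
    """Parse newline-delimited JSON, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def summarize_package(entries, package_name):
    """Collect the default channel, channel entries, and bundles of one package."""
    summary = {"name": package_name, "defaultChannel": None, "channels": {}, "bundles": []}
    for entry in entries:
        schema = entry.get("schema")
        if schema == "olm.package" and entry.get("name") == package_name:
            summary["defaultChannel"] = entry.get("defaultChannel")
        elif schema == "olm.channel" and entry.get("package") == package_name:
            # Keep only the bundle names listed in the channel's update graph.
            summary["channels"][entry["name"]] = [e["name"] for e in entry.get("entries", [])]
        elif schema == "olm.bundle" and entry.get("package") == package_name:
            summary["bundles"].append(entry["name"])
    return summary


if __name__ == "__main__":
    print(summarize_package(load_entries(SAMPLE_CATALOG), "example-operator"))
```

A real client would read the catalog content served on the cluster rather than an inline string; the filtering logic would stay the same.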

      2. Installation:

      • Customers using a ServiceAccount with sufficient permissions can install a k8s extension/operator with a desired target version or the latest version within a specific version range (from the associated channel) to get the latest security fixes.
      • Customers can easily automate the installation flow declaratively with GitOps to ensure predictable and reliable deployments.
      • Customers get protection from having two conflicting k8s extensions/operators owning the same API objects, i.e., no conflicting ownership, ensuring cluster stability.
      • Customers can access the metadata of the installed k8s extension/operator to see essential information such as its provided APIs, example YAMLs of its provided APIs, descriptions, infrastructure features, valid subscriptions, etc.
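To illustrate "the latest version within a specific version range", here is a minimal sketch of range-based selection. It assumes plain `MAJOR.MINOR.PATCH` versions and space-separated comparison constraints such as `>=1.2.0 <2.0.0`; the range syntax OLM v1 actually accepts may differ, and the function names are hypothetical.

```python
import operator

# Two-character operators come first so ">=" is not mistaken for ">".
OPERATORS = [
    (">=", operator.ge),
    ("<=", operator.le),
    (">", operator.gt),
    ("<", operator.lt),
    ("=", operator.eq),
]


def parse_version(text):
    """Turn '1.10.0' (or 'v1.10.0') into a comparable tuple (1, 10, 0)."""
    major, minor, patch = text.lstrip("v").split(".")
    return (int(major), int(minor), int(patch))


def satisfies(version, version_range):
    """Return True if the version meets every constraint in the range."""
    parsed = parse_version(version)
    for constraint in version_range.split():
        for symbol, compare in OPERATORS:
            if constraint.startswith(symbol):
                if not compare(parsed, parse_version(constraint[len(symbol):])):
                    return False
                break
        else:
            raise ValueError(f"unsupported constraint: {constraint}")
    return True


def pick_latest(available, version_range):
    """Pick the highest available version inside the range, or None."""
    candidates = [v for v in available if satisfies(v, version_range)]
    return max(candidates, key=parse_version) if candidates else None


if __name__ == "__main__":
    print(pick_latest(["1.0.0", "1.2.3", "1.10.0", "2.0.0"], ">=1.2.0 <2.0.0"))
```

Comparing parsed tuples rather than raw strings matters here: as strings, `1.10.0` would sort before `1.2.3`.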

      3. Update:

      • Customers can see what updates are available for their k8s extension/operators in the form of immediate target versions and the associated update channels.
      • Customers can trigger the update of a k8s extension/operator with a desired target version or the latest version within a specific version range (from the associated channel) to get the latest security fixes.
      • Customers get protection from workload or k8s extension/operator breakage due to CustomResourceDefinition (CRD) being upgraded to a backward incompatible version during an update.
      • During an OpenShift cluster update, customers get informed when installed k8s extensions/operators do not support the next OpenShift version (when annotated by the package author/provider). Customers must update those k8s extensions/operators to a newer/compatible version before OLM unblocks the OpenShift cluster update.
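The CRD protection mentioned above can be illustrated with one simplified check: an update is flagged when it would stop serving an API version that the currently installed CRD serves. This is only a sketch of a single rule under that assumption. The checks OLM v1 performs are not specified here, and the function name and sample CRDs are hypothetical.

```python
def removed_served_versions(old_crd, new_crd):
    """List API versions served by the old CRD but no longer served by the new one."""
    old_served = {v["name"] for v in old_crd["spec"]["versions"] if v.get("served")}
    new_served = {v["name"] for v in new_crd["spec"]["versions"] if v.get("served")}
    return sorted(old_served - new_served)


# Hypothetical CRD fragments: the update drops v1alpha1, which is still served today.
OLD_CRD = {"spec": {"versions": [
    {"name": "v1alpha1", "served": True, "storage": False},
    {"name": "v1", "served": True, "storage": True},
]}}
NEW_CRD = {"spec": {"versions": [
    {"name": "v1", "served": True, "storage": True},
]}}

if __name__ == "__main__":
    removed = removed_served_versions(OLD_CRD, NEW_CRD)
    if removed:
        print("update would be rejected; versions no longer served:", removed)
```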

      4. Uninstallation/Deletion:

      • Customers can declaratively and cleanly remove an installed k8s extension/operator, including deleting its CustomResourceDefinitions (CRDs), the custom resource objects (CRs) of those CRDs, and other relevant resources, to revert the cluster to its state before the installation.

      5. Disconnected Environments for High-Security Workloads:

      • Approximately 30% of our customers prioritize high security by running their clusters in internet-disconnected environments, especially for mission-critical production workloads. To benefit these users, our supported GA release needs to include cluster extension lifecycle management functionality that works within these disconnected environments.

      6. [Tech Preview] Signature Validation for Secure Workflows:

      • The Red Hat-sponsored Sigstore project is gaining traction in the Kubernetes community, aiming to simplify the signing of cloud-native artifacts. OpenShift leverages Sigstore tooling to enable scalable and flexible signature validation, including support for disconnected environments. This functionality will be available as a Tech Preview in 4.17 and is targeted for General Availability (GA) Tech Preview Phase 2 in the upcoming 4.18 release. To fully support this integration as a Tech Preview release, the (cluster)extension lifecycle management needs to (be prepared to) handle runtime validation of Sigstore signatures for container images.

      Requirements (aka. Acceptance Criteria):

      All the expected user outcomes and the acceptance criteria in the engineering epics are covered.

      Background

      OLM: Gateway to the OpenShift Ecosystem

      Operator Lifecycle Manager (OLM) has been a game-changer for OpenShift Container Platform (OCP) 4.  Since its launch in 2019, OLM has fostered a rich ecosystem, expanding from a curated set of 25 operators to over 100 officially supported Red Hat operators and hundreds more from certified ISVs and the community.

      OLM empowers users to manage diverse technologies with ease, including ACM, ACS, Quay, GitOps, Pipelines, Service Mesh, Serverless, and Virtualization.  It has also facilitated the introduction of groundbreaking operators for entirely new workloads, like Nvidia GPU, PTP, Windows Machine Config, SR-IOV networking, and more.  Today, a staggering 91% of our connected customers leverage OLM's capabilities.

      OLM v0: A Stepping Stone

      While OLM v0 has been instrumental, it has limitations.  The API design, not fully GitOps-friendly or entirely declarative, presents a steeper learning curve due to its complexity.  Furthermore, OLM v0 was designed with the assumption of namespace-scoped CRDs (Custom Resource Definitions), allowing for independent operator installations and parallel versions within a single cluster.  However, this functionality never materialized in core Kubernetes, and OLM v0's attempt to simulate it has introduced limitations and bugs.

      The Operator Framework Team: Building the Future

      The Operator Framework team is the cornerstone of the OpenShift ecosystem.  They build and manage OLM, the Operator SDK, operator catalog formats, and tooling (opm, file-based catalogs).  Their work directly impacts how operators are developed, packaged, delivered, and managed by users and SRE teams on OpenShift clusters.

      A Streamlined Future with OLM v1

      The Operator Framework team has undergone significant restructuring to focus on the next generation of OLM – OLM v1.  This transition includes moving the Operator SDK to a feature-complete state with ongoing maintenance for compatibility with the latest Kubernetes and controller-runtime libraries.  This strategic shift allows the team to dedicate resources to completely revamping OLM's API and management concepts for catalog content delivery.  

      Leveraging learnings and customer feedback since OCP 4's inception, OLM v1 is designed to be a major overhaul, and it will be shipped as a Generally Available (GA) feature in OpenShift 4.17.

      Documentation Considerations

      1. Pre-installation:

      • [GA release] Docs provide instructions on how to add Red Hat-provided Operator catalogs with the pull secret for catalogs hosted on a secure registry.
      • [GA release] Docs provide instructions on how to discover the Operator packages from a catalog.
      • [GA release] Docs provide instructions on how to query and inspect the metadata of Operator bundles and find feasible ones to be installed with the OLM v1.

      2. Installation:

      • [GA release] Docs provide instructions on how to use a ServiceAccount with sufficient permissions to install a k8s extension/operator with a desired target version or the latest version within a specific version range to get the latest security fixes.
      • [GA release] Docs provide instructions on how to automate the installation flow declaratively with GitOps to ensure predictable and reliable deployments.
      • [GA release] Docs mention OLM v1’s protection from having two conflicting k8s extensions/operators owning the same API objects, i.e., no conflicting ownership, ensuring cluster stability.
      • [GA release] Docs provide instructions on how to access the metadata of the installed k8s extension/operator to see essential information such as its provided APIs, example YAMLs of its provided APIs, descriptions, infrastructure features, valid subscriptions, etc.
      • [GA release] Docs explain how to create RBACs from a CRD to grant cluster users access to the installed k8s extension/operator's provided APIs.
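The "no conflicting ownership" protection in this list can be pictured as a pre-install check that compares the CRDs a new extension would provide against those already owned by installed extensions. The sketch below assumes ownership is tracked as a simple mapping from extension name to CRD names. That data model and the function name are hypothetical and do not describe how OLM v1 stores ownership.

```python
def find_ownership_conflicts(installed, candidate_name, candidate_crds):
    """Map each installed extension to the CRDs it shares with the candidate."""
    conflicts = {}
    for owner, owned_crds in installed.items():
        if owner == candidate_name:
            # Re-installing or updating the same extension is not a conflict.
            continue
        shared = sorted(set(owned_crds) & set(candidate_crds))
        if shared:
            conflicts[owner] = shared
    return conflicts


# Hypothetical cluster state: one extension already owns widgets.example.com.
INSTALLED = {
    "widget-operator": ["widgets.example.com"],
    "gadget-operator": ["gadgets.example.com"],
}

if __name__ == "__main__":
    print(find_ownership_conflicts(
        INSTALLED, "widget-operator-fork", ["widgets.example.com", "forks.example.com"]
    ))
```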

      3. Update:

      • [GA release] Docs provide instructions on how to see what updates are available for their k8s extension/operators in the form of immediate target versions and the associated update channels.
      • [GA release] Docs provide instructions on how to trigger the update of a k8s extension/operator with a desired target version or the latest version within a specific version range to get the latest security fixes.
      • [GA release] Docs mention OLM v1’s protection from workload or k8s extension/operator breakage due to CustomResourceDefinition (CRD) being upgraded to a backward incompatible version during an update.
      • [GA release] Docs mention OLM v1 will block the OpenShift cluster update if installed k8s extensions/operators do not support the next OpenShift version (when annotated by the package author/provider).  Provide instructions on how to find and update to a newer/compatible version before OLM unblocks the OpenShift cluster update.
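The cluster-update gate in the last item can be sketched as a comparison between each installed extension's declared maximum OpenShift version and the next cluster version. The example assumes the package author annotates a maximum supported `MAJOR.MINOR` version (similar in spirit to the `olm.maxOpenShiftVersion` property used with OLM v0). How OLM v1 reads that annotation is an assumption here, and the function names and sample data are hypothetical.

```python
def parse_minor(version):
    """Turn '4.17' or '4.17.3' into the comparable tuple (4, 17)."""
    major, minor = version.split(".")[:2]
    return (int(major), int(minor))


def blocking_extensions(installed, next_cluster_version):
    """Return extensions whose declared maximum version is below the next cluster version."""
    target = parse_minor(next_cluster_version)
    blockers = []
    for name, max_version in installed.items():
        if max_version is None:
            # No annotation from the author: this sketch treats it as non-blocking.
            continue
        if parse_minor(max_version) < target:
            blockers.append(name)
    return sorted(blockers)


if __name__ == "__main__":
    installed = {"example-operator": "4.17", "other-operator": "4.18", "unannotated": None}
    print(blocking_extensions(installed, "4.18"))
```

Extensions returned by `blocking_extensions` are the ones that would need an update to a compatible version before the cluster update can proceed.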

      4. Uninstallation/Deletion:

      • [GA release] Docs provide instructions on how to cleanly remove an installed k8s extension/operator including deleting CustomResourceDefinitions (CRDs), custom resource objects (CRs) of the CRDs, and other relevant resources.
      • [GA release] Docs provide instructions to verify the cluster has been reverted to its original state after uninstalling a k8s extension/operator.

      Relevant upstream CNCF OLM v1 requirements, engineering brief, and epics:

      1. Pre-installation:

      2. Installation:

      3. Update:

      4. Uninstallation/Deletion:

      Relevant documents:

       

              Tony Wu (rhn-coreos-tunwu)
              Matthew Werner
              Joe Lanford
              Eric Rich