  1. OpenShift Container Platform (OCP) Strategy
  2. OCPSTRAT-2981

Fleet-wide service mesh management with ACM

    • Type: Outcome
    • Resolution: Unresolved
    • Priority: Normal
    • Product / Portfolio Work
    • 100% To Do, 0% In Progress, 0% Done

      Outcome Overview

      The desired outcome is for service mesh users to use Red Hat Advanced Cluster Management (ACM) to manage their service mesh resources at scale more efficiently than they can today.

      This new set of features will improve the competitiveness of both Red Hat ACM and OpenShift Service Mesh, driving upsells to OpenShift Platform Plus. 

      While service mesh (Istio) includes features to enable multi-cluster use cases, using these features requires a substantial amount of configuration management, such as:

      • Installing service mesh, onboarding clusters (setting up gateways, trust, service discovery in remote clusters)
        • Setting up topologies: multi-primary, primary-remote, external control planes
        • Configuring integrations such as observability (metrics, logs, traces) and Kiali
      • Onboarding application workloads, which may be deployed in multiple clusters and/or mesh instances.
        • Creating/managing Service resources for DNS entries
      • Using service mesh features (configuring security and traffic management policies, using observability features) in a multi-cluster context.
        • As a user, I do not want to repeat the same configuration for each service mesh instance.
      • Configuring and managing tenants in a multi-tenant service mesh deployment (within one or multiple clusters)
      • Maintaining service mesh instances across upgrades

      In a small environment, all of this is achievable with customer DIY configuration management using OpenShift GitOps, Ansible or custom ACM policies. At scale, it becomes more tedious and potentially error-prone.

      As these are largely repeatable tasks, we should be able to provide pre-built ACM customizations that save customers time and money when deploying and managing service mesh at scale.

      From a competitive perspective, these features aim to provide an alternative to competing offerings from Solo.io (Gloo Mesh) and Tetrate.io (Service Bridge), which we know customers like USAA, Amex and BlackRock consider as “must have” for their service mesh deployments at scale. 

      Success Criteria

      What are the success criteria for this strategic outcome? Avoid listing Features or Initiatives and instead describe "what must be true" for the outcome to be considered delivered.

      The feature should save customers time and money when they use it to configure, observe and manage one or more service mesh instances at a large scale (from 5 up to 300 clusters), compared to doing this with standalone OCP environments.

      It should provide a high-level single console view (“single pane of glass”) that can be used as the starting point for observing and managing a large group of service mesh instances spread across many clusters, with the ability to drill down to inspect and configure individual meshes.

      It should provide resources that help manage a large-scale mesh deployment using GitOps approaches consistent with how Kubernetes resources are typically managed. These resources should be flexible: all default settings should be customizable. It should use existing resources from the sail-operator, Istio, Envoy and Kiali as much as possible and avoid creating "wrappers" that would be difficult to maintain.

      It should help with platform integrations (other OCP and ACM features). We should not need to duplicate functions in the addon that are already handled elsewhere.

      It should be possible for existing users of ACM and OSSM to migrate their workloads to this addon.

      This should be seen as a significant value-add for both OSSM and ACM that would remove the need for most customers to use a solution like Solo.io’s Gloo Mesh or Tetrate.io’s Service Bridge.

      Expected Results (what, how, when)

      This outcome will be delivered over multiple releases, starting with an initial MVP that should address some, but not all, of the use cases mentioned earlier. The initial iteration may only provide an onramp for ACM users to install service mesh, with subsequent releases adding more features and automation.

      The expected result is to drive and solidify customer usage of ACM with OpenShift Service Mesh and increase the sales rate of OPP where customers have a need for a multi-cluster service mesh (often motivated by high-availability use cases). We can look at the number and value of new deals (and renewals) where ACM and Service Mesh are factored in.

      For existing service mesh customers who are only subscribed to OCP, this will provide an opportunity to upsell to OPP for centralized management. We can look at the number and value of customers we are able to convert from OCP to OPP.

      Red Hat has a few large customers who have chosen to use similar offerings from Solo/Tetrate. Migrating to a Red Hat offering would require great effort, but these customers have indicated a desire to use a supported Red Hat offering when possible. We should return to these customers to gather feedback and to understand what it would take for them to migrate.

      We will also include telemetry to be able to monitor how many current ACM + OSSM customers are adopting this feature set. We know that there is a lot of overlap between these two products.

      With regard to timelines, as this feature supports an advanced use case, the earliest indicators will be expressions of interest from customers, followed by PoCs, with production usage likely taking 6 months to 1 year after initial GA.

       

      Post Completion Review – Actual Results

      After completing the work (as determined by the "when" in Expected Results above), list the actual results observed / measured during Post Completion review(s).

       

              jlongmui@redhat.com Jamie Longmuir