- Epic
- Resolution: Unresolved
- Major
- ACM 2.12.0
Epic Goal
- Enable native ClusterLogForwarder/OpenTelemetryCollector support for third-party outputs to manage log forwarding across an RHACM-managed fleet of clusters.
- Support the native ClusterLogForwarder/OpenTelemetryCollector authentication methods for each supported output type, as in single-cluster mode.
- Enable native ClusterLogForwarder/OpenTelemetryCollector support per clusterset.
Non-Goals
- End-to-end log forwarding, storage and visualization support for LokiStack/OCP-Console in an RHACM-managed fleet of clusters.
- Support for ClusterLogForwarder on non-OCP managed clusters in the fleet.
Why is this important?
Multi Cluster Observability is an integrated but optional component of the Red Hat Advanced Cluster Management (RHACM) product. Currently it supports only metrics collection, storage and visualization for infra components and user workloads (see the RHACM Observability Docs) on OpenShift Container Platform (OCP) fleets as well as *K8s based fleets. The RHACM observability component is a hybrid of an operator and an addon (see the RHACM Multi-Cluster-Observability-Operator (MCO)). Its architecture consists of a set of components that collect a pre-defined set of OCP metrics, visualize them and alert on fleet-relevant events.
This epic is dedicated to enabling fleet-wide as well as per-clusterset ClusterLogForwarder (CLF) support for RHACM. The proposed solution follows the workflow described in the enhancement proposal. In short, a fleet administrator should be able to use the native ClusterLogForwarder resource as an input for the multi-cluster-observability-addon (MCOA, or addon for short) to provision the same log forwarding capabilities on an entire fleet of clusters or on selected clustersets.
Scenarios
Workflow as proposed in ACM-DDR-025: Multi-cluster Observability Addon
Acceptance Criteria
- Given the fleet administrator creates a ClusterLogForwarder resource on a hub cluster, when the addon is provisioned on that hub cluster, then it will provision the same ClusterLogForwarder on every cluster in the fleet and each cluster will forward logs to the given ClusterLogForwarder outputs.
- Given the fleet administrator creates a ClusterLogForwarder resource on a hub cluster annotated with a list of clusterset names, when the addon is provisioned on that hub cluster, then it will provision the same ClusterLogForwarder only on the clusters in the listed clustersets.
- Given the fleet administrator creates a ClusterLogForwarder resource and a ConfigMap containing a mapping of each output to an authentication method, when the method is mTLS, then it will provision an individual set of client certificates per output for each selected cluster in the fleet.
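The selection logic implied by these criteria can be sketched as follows. This is only an illustration under stated assumptions, not the addon's implementation: the annotation key `example.com/clustersets`, its comma-separated format, the literal value `"mTLS"`, and both function names are hypothetical, since this epic does not specify them.

```python
from typing import Dict, List, Tuple

# Hypothetical annotation key; the epic does not name the real one.
CLUSTERSETS_ANNOTATION = "example.com/clustersets"


def select_clusters(annotations: Dict[str, str],
                    fleet: Dict[str, List[str]]) -> List[str]:
    """Return the clusters that should receive the ClusterLogForwarder.

    `fleet` maps clusterset name -> cluster names. Without the annotation
    the whole fleet is selected (first criterion); with it, only clusters
    in the listed clustersets are selected (second criterion).
    """
    raw = annotations.get(CLUSTERSETS_ANNOTATION, "")
    # Assumed format: comma-separated clusterset names.
    wanted = [name.strip() for name in raw.split(",") if name.strip()]
    sets = wanted if wanted else list(fleet)
    clusters = {c for s in sets for c in fleet.get(s, [])}
    return sorted(clusters)


def mtls_certificates(clusters: List[str],
                      output_auth: Dict[str, str]) -> List[Tuple[str, str]]:
    """List the (cluster, output) pairs that need their own client
    certificate set (third criterion). `output_auth` stands in for the
    ConfigMap mapping of output name -> authentication method."""
    return [(c, o) for c in clusters
            for o, method in sorted(output_auth.items())
            if method == "mTLS"]


if __name__ == "__main__":
    fleet = {"prod": ["c1", "c2"], "dev": ["c3"]}
    print(select_clusters({}, fleet))
    print(select_clusters({CLUSTERSETS_ANNOTATION: "prod"}, fleet))
    print(mtls_certificates(["c1", "c2"], {"loki": "mTLS", "cloudwatch": "token"}))
```

Keeping selection and certificate enumeration as separate steps mirrors the criteria: the set of target clusters is decided first, and certificate provisioning is then applied per output only to the selected clusters.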
Dependencies (internal and external)
N/A
Previous Work (Optional):
- The RHACM policy engine is part of the RHACM governance tools for applying policies to workloads and infrastructure across the entire fleet. It serves well as long as policies are easy to define and do not have too many dependencies on each other or on other components (e.g. installed OLM operators, TLS certificate generation). In contrast, managing multi-cluster logging means managing the OpenShift Logging Operator installation (i.e. Cluster-Logging-Operator) and the custom resources (i.e. ClusterLogForwarder) as well as authentication-related artifacts (i.e. generating TLS certificates, Cloud Provider Managed Identities / ServiceAccounts, etc.). Although all of these can be achieved using the RHACM policy engine, this approach falls short when it comes to error reporting (e.g. operator installation/upgrade failures, CRD handling) and health reporting (e.g. running/ready progress of operators and operands).
The spike for this alternative can be found here: https://gitlab.cee.redhat.com/openshift-logging/log-storage-toolbox/-/merge_requests/14
- To the untrained eye, the RHACM MCO operator looks like the natural place to add multi-cluster logging capabilities. It has a hybrid architecture, being both a Kubernetes operator (i.e. managing the MultiClusterObservability CRD) and an addon (i.e. provisioning observability agents/policies to clustersets). Despite its attractiveness, the underlying design makes it hard to extend to further signals without a tremendous engineering effort (i.e. re-exposing ClusterLogForwarder inside the MultiClusterObservability CR, managing long-term version skew with logging operators, etc.).
Open questions:
- …
Done Checklist
- CI - CI is running, tests are automated and merged.
- Release Enablement <link to Feature Enablement Presentation>
- DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
- DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
- DEV - Downstream build attached to advisory: <link to errata>
- QE - Test plans in Polarion: <link or reference to Polarion>
- QE - Automated tests merged: <link or reference to automated tests>
- DOC - Doc issue opened with a completed template. Separate doc issue
opened for any deprecation, removal, or any current known
issue/troubleshooting removal from the doc, if applicable.