-
Feature
-
Resolution: Unresolved
100% To Do, 0% In Progress, 0% Done
Background
Application observability
Historically, observability in OpenShift has focused mostly on platform observability, across pillars that also offered some great capabilities for applications. However, the platform was falling short on:
- Providing capabilities designed to observe applications as a whole in the UI
- Providing a unified UI for all pillars
- Providing a unified and easy installation
Several components have since been added, to name just a few:
- Application Performance Monitoring view, which needs the following to be installed and properly configured:
  - OpenTelemetry Collector producing metrics out of traces
  - Prometheus to collect those metrics
  - Tempo stack
  - UI plugin (nowadays installed via COO)
- Distributed tracing plugin, which needs COO and tenancy for read and write operations, and provides the following features:
  - Traces scatter plot
  - Traces table
  - Gantt chart
  - Links between spans and pod logs and metrics
- Tempo Monolithic deployment, which helps users easily install an in-memory tracing stack
- Highly configurable Prometheus stack, provided as part of COO
Application observability in OpenShift now has all the components needed to provide a good experience to users. However, installing it is not straightforward, and many users fall back to other options for application observability, such as third-party observability vendors or open source solutions (for example, Grafana + LGTM stack).
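To illustrate what "producing metrics out of traces" means for the APM view, here is a minimal sketch that derives RED-style metrics (rate, errors, duration) from a batch of spans. It is not how the OpenTelemetry Collector implements this; the `Span` type, the `red_metrics` function, and the field names are hypothetical and exist only for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical, simplified span: only the fields needed for RED metrics."""
    service: str
    duration_ms: float
    is_error: bool


def red_metrics(spans, window_seconds):
    """Aggregate spans observed in a time window into per-service RED metrics."""
    by_service = defaultdict(list)
    for span in spans:
        by_service[span.service].append(span)

    result = {}
    for service, items in by_service.items():
        durations = [s.duration_ms for s in items]
        errors = sum(1 for s in items if s.is_error)
        result[service] = {
            "rate_per_s": len(items) / window_seconds,         # Rate
            "error_ratio": errors / len(items),                # Errors
            "avg_duration_ms": sum(durations) / len(items),    # Duration
            "max_duration_ms": max(durations),
        }
    return result


if __name__ == "__main__":
    sample = [
        Span("checkout", 100.0, False),
        Span("checkout", 300.0, True),
        Span("cart", 50.0, False),
    ]
    for service, metrics in red_metrics(sample, window_seconds=60).items():
        print(service, metrics)
```

In a real installation these values would be exposed as time series and scraped by Prometheus rather than computed in a batch.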
Defining and using tenants
Also, after both the security hardening in the Jaeger UI and the addition of the Tracing plugin in the console, it is mandatory to have well-defined tenants for both read and write operations. While this improves security, it can make it harder for users to:
- Start using distributed tracing
- Configure environments
- Easily debug issues by quickly installing a Tempo instance to troubleshoot ongoing incidents
- Install tracing in development environments
Requirements
That's why, as part of this feature, a solution shall be delivered that:
- Installs an application observability stack that provides metrics and traces collection, storage and visualization, including:
  - APM that includes relevant RED metrics
  - Distributed tracing to monitor and troubleshoot the path of a request and link it to other signals
- Installs one or multiple tenants via RBAC, based on customer needs. The following may be considered:
  - Optional default tenant installation
  - Letting the user define "who" is writing and "who" is reading ("who" means which tenant)
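As a rough illustration of the tenancy requirement, the sketch below models tenants with separate writer and reader sets plus an optional default tenant. It is only a data-model sketch under stated assumptions, not the RBAC mechanism itself; `Tenant`, `build_tenants`, `is_allowed`, and the subject names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Tenant:
    """Hypothetical tenant definition: who may write and who may read."""
    name: str
    writers: set = field(default_factory=set)
    readers: set = field(default_factory=set)


def build_tenants(definitions, with_default=False):
    """Index tenants by name, optionally adding an empty 'default' tenant."""
    tenants = {t.name: t for t in definitions}
    if with_default and "default" not in tenants:
        tenants["default"] = Tenant("default")
    return tenants


def is_allowed(tenants, subject, tenant_name, action):
    """Return True if `subject` may perform `action` ('read' or 'write')."""
    tenant = tenants.get(tenant_name)
    if tenant is None:
        return False  # unknown tenants are denied
    if action == "write":
        return subject in tenant.writers
    if action == "read":
        return subject in tenant.readers
    raise ValueError(f"unknown action: {action!r}")


if __name__ == "__main__":
    tenants = build_tenants(
        [Tenant("dev", writers={"otel-collector"}, readers={"dev-team"})],
        with_default=True,
    )
    print(is_allowed(tenants, "otel-collector", "dev", "write"))
    print(is_allowed(tenants, "otel-collector", "dev", "read"))
```

Keeping writers and readers as separate sets mirrors the requirement that the "who" for writing and the "who" for reading are defined independently.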
Future steps
- To be discussed whether these can be delivered now:
  - Logs can remain part of the pod console output for now, because log storage always requires an object store (there is no monolithic option). This is open to suggestions and could be extended in future improvements.
  - Dashboarding for applications
List any affected packages or components.
- Red Hat build of OpenTelemetry
- Tempo
- Tracing plugin