  1. OpenShift Logging
  2. LOG-5637

In-Cluster end-to-end support for OTLP-ingestion and OTEL Semantic Conventions


    • In-Cluster OTLP Support
    • False
    • None
    • False
    • Green
    • NEW
    • Administer, API, Instructions, Storage
    • To Do
    • OBSDA-849 - Enable Full In-Cluster OpenTelemetry Support In OpenShift Logging
    • NEW
    • 0% To Do, 0% In Progress, 100% Done
    • Technology Preview

      Goals

      • Provide an in-cluster log search and alerting experience based on OpenTelemetry Semantic Conventions
      • Provide OTLP-native log ingestion for third party trusted/verified collectors.
      • Provide native zero-effort OpenShift Logging Tenancy and RBAC integration to third-party trusted/verified collectors

      Non-Goals

      • Replace ViaQ model for OpenTelemetry Semantic Conventions entirely.
      • Open the in-cluster log store for public non-verified log ingestion.

      Motivation

      For years before OpenTelemetry was formed, the OpenShift Logging product relied on the so-called ViaQ format to encode the log payload sent from the collector to the log store. With the emergence of OpenTelemetry, many of our customers and internal teams ask to store and retrieve logs in OpenShift Logging's LokiStack using the OpenTelemetry Semantic Conventions. These conventions represent a cross-vendor observability effort to unify how observability signal data in general, and logs in particular, are encoded and retrieved. The impact of such a publicly maintained standard for observability data encoding is significant in that it yields:

      • Easy and recognizable fields to retrieve log records.
      • Easy cross-platform observability correlation, i.e. across different store solutions
      • Unified query and alerting experience across signals.

      Furthermore, having a log storage solution like LokiStack ingest logs via OTLP using the OpenTelemetry Semantic Conventions builds on this momentum by allowing in-cluster ingestion from other trusted collectors/forwarders, e.g. an OpenTelemetryCollector sending application logs collected via instrumentation.

      Alternatives

      As an alternative to the solution proposed above (support for log records formatted according to OpenTelemetry Semantic Conventions via OTLP, in addition to ViaQ), we could provide a mapping facility from OpenTelemetry Semantic Conventions to ViaQ. Because the Log Console is written with ViaQ in mind, this alternative would make changes in the UI unnecessary. However, the mapping would have to keep pace with the growing number of amendments to the Red Hat OpenTelemetry Semantic Conventions.

      Acceptance Criteria

      1. Given the logging administrator creates a ClusterLogForwarder resource, when a LokiStack output type is provided and configured with the OTEL data model, then the vector component forwards logs to LokiStack using Loki's 3.x OTLP endpoint.
      2. Given the cluster administrator creates an OpenTelemetryCollector resource, when an exporter to LokiStack is provided, then the OpenTelemetry collector forwards logs to LokiStack using Loki's 3.x OTLP endpoint.
      3. Given the cluster administrator searches for logs in the Log Console, when using OpenTelemetry Semantic Conventions to name labels and filters in LogQL, then LokiStack returns the matching log records.
      4. Given the cluster administrator searches for logs, when using one of the Log Console UI filters (currently severity/namespaces/pod/containers/tenant), then LokiStack returns the matching log records by using OpenTelemetry Semantic Conventions to name labels and filters in LogQL.
      5. Given the developer searches for logs across namespaces in the Log Console, then LokiStack returns the matching log records by using OpenTelemetry Semantic Conventions to name labels and filters in LogQL.
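
To make criteria 1 and 2 more concrete, the following is a minimal sketch of the kind of OTLP/HTTP JSON body a collector would send to an OTLP logs endpoint. The helper name `build_otlp_logs_payload` and the sample values are hypothetical and not part of this issue. The resource attribute keys follow the OpenTelemetry Semantic Conventions for Kubernetes, and the sketch assumes string-valued attributes only. It builds the payload but does not send it, since the exact LokiStack endpoint URL and authentication are outside what this issue specifies.

```python
import json
import time


def build_otlp_logs_payload(body, severity_text, resource_attrs, time_unix_nano=None):
    """Build an OTLP/HTTP JSON logs payload with a single log record.

    Hypothetical helper for illustration; assumes all resource
    attributes are plain strings.
    """
    if time_unix_nano is None:
        time_unix_nano = time.time_ns()
    attributes = [
        {"key": key, "value": {"stringValue": value}}
        for key, value in resource_attrs.items()
    ]
    return {
        "resourceLogs": [
            {
                "resource": {"attributes": attributes},
                "scopeLogs": [
                    {
                        "scope": {},
                        "logRecords": [
                            {
                                # 64-bit integers are encoded as strings in OTLP JSON.
                                "timeUnixNano": str(time_unix_nano),
                                "severityText": severity_text,
                                "body": {"stringValue": body},
                            }
                        ],
                    }
                ],
            }
        ]
    }


if __name__ == "__main__":
    payload = build_otlp_logs_payload(
        body="user login succeeded",
        severity_text="INFO",
        resource_attrs={
            "k8s.namespace.name": "my-app",  # sample values, not from the issue
            "k8s.pod.name": "web-0",
            "k8s.container.name": "web",
        },
    )
    print(json.dumps(payload, indent=2))
```

Resource attributes such as `k8s.namespace.name` are the fields a store can index as stream labels or keep as structured metadata, which is what the search criteria 3 to 5 rely on.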

      Risk and Assumptions

      The Log Console currently assumes that log records returned by LokiStack are in the ViaQ format. In addition, with a default retention of 30 days (currently the maximum supported in OpenShift Logging releases), the log store will hold logs already ingested using ViaQ alongside new logs ingested using OpenTelemetry Semantic Conventions. The Log Console therefore needs to translate each UI filter into a LogQL stream selector for both formats.
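
The dual-format translation described above could look roughly like the sketch below. It is not the Log Console's implementation. The function names and both label tables are assumptions made for illustration: the ViaQ side uses the underscore forms of the compatibility attributes, and the OpenTelemetry side uses the `k8s.*` semantic convention names with dots replaced by underscores. Because a single LogQL stream selector cannot express "either label set", the sketch returns one selector per format, which a client would run as two queries and merge.

```python
# Hypothetical label tables; the real names depend on how LokiStack is configured.
VIAQ_LABELS = {
    "namespace": "kubernetes_namespace_name",
    "pod": "kubernetes_pod_name",
    "container": "kubernetes_container_name",
}
OTEL_LABELS = {
    "namespace": "k8s_namespace_name",
    "pod": "k8s_pod_name",
    "container": "k8s_container_name",
}


def _escape(value):
    """Escape backslashes and double quotes for a LogQL string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_selector(filters, label_names):
    """Turn UI filters (filter name -> value) into one LogQL stream selector."""
    matchers = [
        f'{label_names[key]}="{_escape(value)}"'
        for key, value in sorted(filters.items())
    ]
    return "{" + ", ".join(matchers) + "}"


def build_selectors_for_both_formats(filters):
    """Return one selector per data model so old and new logs are both found."""
    return {
        "viaq": build_selector(filters, VIAQ_LABELS),
        "otel": build_selector(filters, OTEL_LABELS),
    }


if __name__ == "__main__":
    selectors = build_selectors_for_both_formats({"namespace": "my-app", "pod": "web-0"})
    for model, selector in selectors.items():
        print(model, selector)
```

Once the retention window no longer contains ViaQ-formatted logs, only the OpenTelemetry selector would be needed.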

      Documentation Considerations

      • This upstream description provides important details on how users can search their logs, e.g. what is indexed as a stream label and what is collected as structured metadata.
      • In the Configuring LokiStack section, we need another subsection that describes how to tune the LokiStack behavior for OpenShift Logging as described in this PR: https://github.com/grafana/loki/pull/14410
        • These OTLP LokiStack tuning options replace the ClusterLogForwarder field lokistack.spec.outputs[].lokiStack.labelKeys
      • Furthermore:
        • The upstream data model documentation provides a set of attributes that are deprecated but used as a compatibility layer so that the UI continues to work, namely log_source, log_type, kubernetes.namespace_name, kubernetes.container_name and kubernetes.pod_name. It should be highlighted that we will continue supporting them until the Logging UI exclusively supports the OpenTelemetry counterparts in a future release.

      Open Questions

      N/A

      Additional Notes

      N/A

              rojacob@redhat.com Robert Jacob
              ptsiraki@redhat.com Periklis Tsirakidis
              Kabir Bharti Kabir Bharti
