OBSDOCS-84

Loki Operator for on-cluster deployment and management


      Goals

      Provide a Loki operator that delivers an on-cluster log storage solution.

      It should be able to install, update, and manage a Loki cluster. This includes providing alerting rules and playbooks to help customers keep Loki healthy and performant.

      Non-Goals

      This will not be a general-purpose Loki operator; the current intention is for it to be used with the Cluster Logging Operator on-cluster.

      Motivation

      In anticipation of winding down support for the Elasticsearch storage engine (driven by anticipated use cases), we need to offer customers an alternative log store that supports their intended use case of longer-term storage and is less resource intensive.

      Alternatives

      Continue to use Elasticsearch

      Acceptance Criteria

      • Verify that the Loki operator is able to install an on-cluster Loki cluster for logs to flow into
      • Verify that the Loki operator is able to apply changes to the cluster configuration once installed (e.g. updates)
      • Verify that the Loki operator provides alerting rules (and dashboards?) so that customers can keep Loki healthy and intervene to tune it when necessary

      Risk and Assumptions

      • We want to ensure that we do not make the Loki operator available until we are sure it will be able to provide a nicer UX than what is currently experienced with Elasticsearch
      • Properly testing the clusters at scale may be difficult before GA
        • We are looking into finding Technology Preview (TP) customers who understand that this is TP
        • Ensuring that the operator is hardened enough for TP is another risk
      • The initial release of the Loki operator will not leverage caching, so it may not be as performant (but this simplifies the initial iteration)

      Documentation Considerations

      Given that the Loki and Elasticsearch use cases are slightly different (longer-term storage vs. simple log aggregation with filtering/querying), we should outline the differences so that users understand them.

      [rkratky] This is quite possibly covered as part of other tasks, but leaving here just to make sure.

      Open Questions

      • Planned feature release?
      • Do we need to match the versioning of the current logging components or can we start the operator at a 1.0 release?
      • How do we get the Loki component images built in CPaaS and provided for upstream installs?
      • Do we need to have this ready when the Grafana pilot ends?

              Assignee: Unassigned
              Reporter: Robert Krátký (rkratky@redhat.com) (Inactive)