- Task
- Resolution: Done
- Major
- None
- 3
Goals
Provide a Loki operator that delivers an on-cluster log storage solution.
It should be able to install, update, and manage a Loki cluster. This includes providing alerting rules and playbooks to help customers keep Loki healthy and performant.
Non-Goals
This will not be a general-purpose Loki operator; the (current) intention is for it to be used with the Cluster Logging Operator on-cluster.
Motivation
In anticipation of winding down support for the Elasticsearch storage engine (due to anticipated use cases), we need to offer customers an alternative log store that supports their intended use cases of longer-term storage and is less resource intensive.
Alternatives
Continue to use Elasticsearch
Acceptance Criteria
- Verify that the Loki operator is able to install an on-cluster Loki cluster for logs to flow into
- Verify that the Loki operator is able to make updates to the cluster configuration once installed (e.g. updates)
- Verify that the Loki operator provides alerting rules (and dashboards?) so that customers can keep Loki healthy and intervene to tune when necessary
Risk and Assumptions
- We do not want to make the Loki operator available until we are sure it will provide a nicer UX than what is currently experienced with Elasticsearch
- Properly testing the clusters at scale may be difficult before GA
- We are looking into finding TP customers (who understand that this is TP)
- Ensuring that the operator is hardened enough for TP is another risk
- The initial release of the Loki operator will not leverage caching, so it may not be as performant (but this simplifies the initial iteration)
Documentation Considerations
Given that the Loki and Elasticsearch use cases differ slightly (longer-term storage vs. simple log aggregation with filtering/querying), we should outline the differences so that users understand them.
Open Questions
- Planned feature release?
- Do we need to match the versioning of the current logging components or can we start the operator at a 1.0 release?
- How do we get the Loki component images built in CPaaS and provided for upstream installs?
- Do we need to have this ready when the Grafana pilot ends?
Additional Notes
- documents: LOG-1797 Loki Operator GA (Closed)
- is related to: RHDEVDOCS-3138 Initial creation and work of the Loki Operator for on-cluster deployment and management (Closed)
- relates to: RHDEVDOCS-4064 Cluster-Logging Loki Integration (Closed)