Task
Resolution: Done
Status: NEW
Epic: OBSDA-7 - Adopting Loki as an alternative to Elasticsearch to support more lightweight, easier to manage/operate storage scenarios
Story
As an OpenShift user, I'd like to configure OpenShift Logging to forward logs to my own Loki instance
so that I can route log messages according to my business rules.
Acceptance Criteria/DoD
- Log forwarding CRD API exposes output type for Loki ("type: Loki") with the following configuration options:
- Tenant as endpoint URI path. One tenant per OutputRef
- Server-side TLS support only
- Translation of fluentd fields to Loki labels.
- Any log message from a particular source must be forwarded to the remote Loki instance configured in the output section.
- OpenShift Logging translates the Loki output configuration into a valid fluentd config.
- Log messages are forwarded without the fields: docker.*, kubernetes.*, pipeline_metadata.*
- Documentation describing what is (and is not) configurable.
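To make the acceptance criteria concrete, the output section could look something like the sketch below. The exact field names (`url` carrying the tenant as the URI path, `secret` for the TLS CA bundle) are assumptions for illustration, not a settled API:

```yaml
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
    - name: loki-example        # arbitrary output name
      type: loki                # proposed Loki output type
      # Tenant encoded as the endpoint URI path; one tenant per OutputRef.
      url: https://loki.example.com:3100/tenant1
      secret:
        name: loki-tls          # server-side TLS only: CA bundle for verification
  pipelines:
    - name: forward-app-logs
      inputRefs: [application]  # messages from this source go to the Loki output
      outputRefs: [loki-example]
```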
Open Questions
What subset of the logging data model (metadata) do we want to present as Loki labels?
Loki has a restrictive limit on labels (15 per stream?) and does not do well if the label combinations yield very high cardinality.
For correlation we need at least basic data on the origin of the logs:
- container-sourced logs: cluster name, namespace name/uid, pod name/uid, container name
- node-sourced logs at least: cluster name, node id, log type
- log type (application, infra, audit) for all logs.
We've now used 7 of the 15 label slots for basic origin info.
Unfortunately these don't help much with narrowing a log search except when
* you want to narrow by namespace
* you have already identified a set of pods that are of interest (e.g. correlating with trace or metric data that pin-pointed some pods.)
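For illustration, those two narrowing cases might look like this in LogQL (the label names here, such as `kubernetes_namespace_name`, are assumed, not fixed by this story):

```logql
# Narrow by namespace
{kubernetes_namespace_name="payments"}

# Narrow to pods already identified via correlated trace or metric data
{kubernetes_namespace_name="payments", kubernetes_pod_name=~"checkout-.*"}
```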
Can we really ignore kubernetes labels as Loki labels?
Advice from Grafana is: don't put all the k8s labels into Loki labels; they tried it and regretted it.
Given our correlation needs above, we only have 8 or so label slots left to work with, so we can't anyway.
But if applications are distributed across multiple namespaces, labels may be the only way to identify them.
So: do we need to enhance the Loki output API to allow the user to nominate small sets of "important" labels?
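If we did enhance the API that way, one possible shape is sketched below. The `loki.labelKeys` field is purely hypothetical here, named only to show the idea of promoting a small, user-chosen set of k8s labels:

```yaml
outputs:
  - name: loki-example
    type: loki
    url: https://loki.example.com:3100/tenant1
    loki:
      # Hypothetical field: user nominates a small set of "important"
      # k8s labels to promote to Loki labels, keeping cardinality low.
      labelKeys:
        - kubernetes.labels.app
        - kubernetes.labels.tier
```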
What goes in the log body?
It seems likely we can just dump our existing JSON log record (with metadata) as the Loki body. JSON is reasonably structured and popular in the logging world.
To investigate: is JSON the best format for this, or does Loki favour other formats?
Providing alternate formats is probably out of scope here. Alternate payload formats can be dealt with as a generic forwarder feature that applies to all outputs.
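For reference, dumping the existing JSON record as the body would make a Loki push payload look roughly like this. The label names and record fields are illustrative, not a committed data model:

```json
{
  "streams": [
    {
      "stream": {
        "log_type": "application",
        "kubernetes_namespace_name": "payments",
        "kubernetes_pod_name": "checkout-6d9f"
      },
      "values": [
        ["1609459200000000000",
         "{\"message\":\"order created\",\"level\":\"info\",\"kubernetes\":{\"container_name\":\"checkout\"}}"]
      ]
    }
  ]
}
```

Here the origin metadata needed for correlation rides in the stream labels, while the full structured record stays in the body line.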