-
Feature Request
-
Resolution: Done
1. Proposed title of this feature request
Distributed tracing of platform components
2. What is the nature and description of the request?
The capability to configure tracing for platform components including (but not limited to) kube-apiserver, etcd, and kubelet. Additionally, any future components that are instrumented with OTel-supported tracing should be configurable (for example openshift-apiserver, oauth).
The trace destination must be configurable, and traces could be consumed either by a provided OTel collector or by a user-provided collector, depending on the requirement to integrate with the OCP console itself.
The collector should be configurable to enable a centralised observability model.
https://kubernetes.io/docs/concepts/cluster-administration/system-traces/#kube-apiserver-traces
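As a rough illustration of what "configurable trace destination" could mean for kube-apiserver, the sketch below builds the upstream TracingConfiguration object that kube-apiserver reads via --tracing-config-file. It is a sketch under assumptions, not a proposed OpenShift API: the helper names and the collector address are hypothetical, and the apiVersion depends on the Kubernetes release (v1beta1 is assumed here). The output is JSON, which is also valid YAML.

```python
import json

MAX_RATE = 1_000_000  # samplingRatePerMillion is expressed as samples per million spans


def tracing_config(endpoint: str = "localhost:4317",
                   sampling_rate_per_million: int = 100) -> dict:
    """Build a kube-apiserver TracingConfiguration as a plain dict.

    `endpoint` is the OTLP gRPC address of the collector. The apiVersion
    below is an assumption and varies with the Kubernetes release.
    """
    if not 0 <= sampling_rate_per_million <= MAX_RATE:
        raise ValueError("samplingRatePerMillion must be between 0 and 1000000")
    return {
        "apiVersion": "apiserver.config.k8s.io/v1beta1",  # assumed version
        "kind": "TracingConfiguration",
        "endpoint": endpoint,
        "samplingRatePerMillion": sampling_rate_per_million,
    }


def render(config: dict) -> str:
    """Serialize the configuration as JSON (a subset of YAML)."""
    return json.dumps(config, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical user-provided collector address, sampling 0.1% of requests.
    print(render(tracing_config("otel-collector.example.svc:4317", 1000)))
```

Switching between a provided collector and a user-provided collector would then amount to changing only the endpoint value, which is the kind of knob this request asks to have exposed.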
3. Why does the customer need this? (List the business requirements here)
SRE requires the following:
- Move to a high cardinality/dimensionality data set that enables post-aggregation rather than pre-aggregation, which is the current form of metrics.
- Enable querying at a wider scope (limited by the K:Vs in traces, rather than by labels in metrics or data in logs). This aids troubleshooting during an incident.
- Enable high-resolution analytics that assist in assessing nominal performance.
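The difference between pre- and post-aggregation can be sketched as follows. With metrics, the dimensions are fixed when the metric is emitted; with traces, each span keeps its raw key/value attributes, so the grouping key can be chosen at query time. The span records, attribute names, and the aggregate helper below are all hypothetical and only illustrate the idea.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical span records: each keeps its own arbitrary key/value attributes.
SPANS = [
    {"name": "List", "duration_ms": 10.0,
     "attrs": {"resource": "pods", "user_agent": "kubectl"}},
    {"name": "List", "duration_ms": 30.0,
     "attrs": {"resource": "pods", "user_agent": "controller"}},
    {"name": "Get", "duration_ms": 5.0,
     "attrs": {"resource": "nodes", "user_agent": "kubectl"}},
]


def aggregate(spans, key):
    """Group spans by an attribute chosen at query time and average durations."""
    groups = defaultdict(list)
    for span in spans:
        # Spans lacking the attribute are collected under a placeholder value.
        groups[span["attrs"].get(key, "<unset>")].append(span["duration_ms"])
    return {value: mean(durations) for value, durations in groups.items()}


if __name__ == "__main__":
    # The same raw data answers two different questions without re-instrumenting.
    print(aggregate(SPANS, "resource"))
    print(aggregate(SPANS, "user_agent"))
```

During an incident, this is what allows a question such as "which client is slow?" to be asked of data that was not pre-aggregated along that dimension.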
4. List any affected packages or components.
- kube-apiserver
- kubelet
- etcd
Reference
Upstream work
- kube-apiserver | https://kubernetes.io/docs/concepts/cluster-administration/system-traces/#kube-apiserver-traces
- etcd | https://etcd.io/docs/v3.6/op-guide/monitoring/#distributed-tracing | https://github.com/etcd-io/etcd/issues/12460
- kubelet | https://kubernetes.io/docs/concepts/cluster-administration/system-traces/#kubelet-traces
Downstream work in OpenShift would be:
- instrument openshift-apiserver
Related moves in this space:
AWS enabling OTel tracing to span customer applications into AWS resources: https://aws.amazon.com/otel/