-
Epic
-
Resolution: Unresolved
-
remove-write telemetry
-
To Do
-
100% To Do, 0% In Progress, 0% Done
The motivation is unchanged from the first effort to move telemetry to remote write:
Goal: Replace the current deployment of Telemeter client with Prometheus Remote Write.
Problem:
- Telemeter client is a piece of software maintained by Red Hat that sends telemetry data. Its backend (today called Telemeter server) is now being replaced by the Observatorium infrastructure (formerly known as Telemeter v2).
- Observatorium relies heavily on Thanos, including its recently added ability to accept Prometheus remote write requests.
- Effectively, this renders Telemeter client obsolete and unnecessary, as we can use Prometheus' native remote write functionality for telemetry data.
Why is this important:
The later we migrate to Prometheus Remote Write, the longer we have to maintain Telemeter client. This implies a big maintenance burden.
...
Some details have changed, but overall we still consider it beneficial to switch telemetry to remote write. Two attempts were made, but the changes were ultimately reverted in favor of Telemeter client over concerns about data volume when remote-writing from the in-cluster Prometheus.
We have two reasons to attempt this again:
- The telemetry receiver side has received significant scaling improvements. Reportedly it might even support telemetry data at OCP scrape frequency, though this needs to be confirmed and verified.
- We are tasked with implementing Optional in-cluster monitoring while retaining the ability to send telemetry. This means we need to be able to send telemetry in the absence of a fully fledged Prometheus instance. Telemeter client will likely not be the technology to implement this. While Optional in-cluster monitoring will require additional work, switching to remote write is a prerequisite.
The minimum viable definition of done here is simply to switch telemetry to the remote write protocol, possibly by deploying an additional Prometheus instance that federates from the in-cluster Prometheus.
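As a rough illustration of the "additional Prometheus instance that federates from in-cluster" option, the sketch below builds a minimal Prometheus configuration with one `/federate` scrape job and one `remote_write` target. The config keys (`scrape_configs`, `metrics_path`, `params`, `honor_labels`, `remote_write`) are standard Prometheus settings. The host names, the receiver URL, the job name, the `5m` interval, and the matcher are hypothetical placeholders, not values from this epic. Authentication and TLS, which a real in-cluster setup would need, are left out.

```python
import json


def build_federating_config(source_target, remote_write_url, matchers,
                            scrape_interval="5m"):
    """Build a Prometheus config dict that federates selected series
    from an existing Prometheus and forwards them via remote write.

    All argument values are supplied by the caller; nothing here is
    taken from an actual cluster configuration.
    """
    return {
        "global": {"scrape_interval": scrape_interval},
        "scrape_configs": [
            {
                # Hypothetical job name for the federation scrape.
                "job_name": "telemetry-federate",
                # Keep the labels exposed by the source Prometheus.
                "honor_labels": True,
                "metrics_path": "/federate",
                # Only series matching these selectors are pulled.
                "params": {"match[]": list(matchers)},
                "static_configs": [{"targets": [source_target]}],
            }
        ],
        # Forward everything this instance scrapes to the receiver.
        "remote_write": [{"url": remote_write_url}],
    }


# Placeholder values for illustration only.
config = build_federating_config(
    source_target="in-cluster-prometheus.example:9090",
    remote_write_url="https://telemetry-receiver.example/api/v1/receive",
    matchers=['{__name__="up"}'],
)

# JSON is a subset of YAML, so this output can be used as a config file.
print(json.dumps(config, indent=2))
```

Limiting the federated series with `match[]` keeps the forwarded data volume bounded, which is the concern that led to the earlier reverts.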
- is related to: MON-4101 Explore future telemetry architectures (New)