Type: Epic
Resolution: Unresolved
Priority: Undefined
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
- groomed

Epic Name:
agent vs server resources
Blocked:
False
Blocked Reason:
None
Ready:
False
Color Status:
Not Selected
Docs QE Status:
NEW
Epic Status:
To Do
Parent Link:
MON-3156 Upstream improvements
QE Status:
NEW

For telemetry-only use cases, it seems desirable to use Prometheus agent mode, since on the surface we only want to forward data.

However, this comes with a big trade-off: recording rules (and alerts) cannot be deployed. With an agent strategy for sending telemetry, all aggregation (normally handled by recording rules) and down sampling* would have to move to the telemeter receiving side. This could massively increase resource usage on the telemeter side, although that is not the topic of this research.

The goal of this epic is to quantify the difference in Prometheus resource usage between two scenarios:

- A Prometheus agent scraping data and remote writing that data to a target at scrape frequency.
- A Prometheus server scraping data and evaluating recording rules at scrape frequency.
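
One possible way to compare the two scenarios is sketched below in Python. It relies on `process_cpu_seconds_total` and `process_resident_memory_bytes`, which Prometheus exposes about its own process. The `job` label values, the helper names and the choice of the mean as summary statistic are assumptions made for illustration, not part of this epic.

```python
from statistics import mean


def resource_queries(job: str, window: str = "5m") -> dict[str, str]:
    """Build PromQL expressions for the CPU and memory usage of one instance.

    `job` is a hypothetical label value identifying the agent or the server.
    """
    return {
        "cpu_cores": f'rate(process_cpu_seconds_total{{job="{job}"}}[{window}])',
        "rss_bytes": f'process_resident_memory_bytes{{job="{job}"}}',
    }


def compare(agent: list[float], server: list[float]) -> dict[str, float]:
    """Summarise two sample series of the same resource metric."""
    agent_mean = mean(agent)
    server_mean = mean(server)
    return {
        "agent_mean": agent_mean,
        "server_mean": server_mean,
        # Values above 1 mean the server used more of the resource.
        "server_to_agent_ratio": server_mean / agent_mean,
    }


if __name__ == "__main__":
    # Hypothetical job names; print the queries to run against each instance.
    for job in ("prometheus-agent", "prometheus-server"):
        for name, query in resource_queries(job).items():
            print(f"{job} {name}: {query}")
```

The samples passed to `compare` would come from running these queries over the same time window for both deployments, with the same scrape targets.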

The scraped dataset should be fairly small to mimic the telemetry use case: no more than a few thousand time series. The server deployment should have a very short retention period (2h). It could be worth configuring smaller --storage.tsdb.min-block-duration and --storage.tsdb.max-block-duration values for the server instance.

* Down sampling here refers to the difference between the telemetry interval (5m) and the scrape interval (usually 30s). To get accurate recording rule results, data has to be present at a higher resolution than the telemetry resolution.
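
The ratio between the two intervals can be worked out with a short Python sketch. The interval values are the ones quoted above; the function name is made up for illustration.

```python
def samples_per_telemetry_interval(telemetry_interval_s: int, scrape_interval_s: int) -> int:
    """Number of scraped samples per series that fall into one telemetry interval."""
    if scrape_interval_s <= 0 or telemetry_interval_s % scrape_interval_s != 0:
        raise ValueError("telemetry interval must be a positive multiple of the scrape interval")
    return telemetry_interval_s // scrape_interval_s


TELEMETRY_INTERVAL_S = 5 * 60  # 5m telemetry interval
SCRAPE_INTERVAL_S = 30         # usual 30s scrape interval

factor = samples_per_telemetry_interval(TELEMETRY_INTERVAL_S, SCRAPE_INTERVAL_S)
print(factor)  # 10
```

With these values an agent remote writing at scrape frequency forwards ten samples per series for every sample that a 5m telemetry resolution needs, which is the volume the receiving side would have to down sample.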

COO can be used to accomplish this, since we install the prometheusagents CRD. One only has to create a PrometheusAgent CR manually.
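
A minimal sketch of such a CR, built as a Python dictionary and printed as JSON (which `kubectl apply -f -` accepts), could look like the following. The name, namespace, selector labels and remote write URL are hypothetical. The API group is also an assumption: upstream prometheus-operator serves PrometheusAgent under `monitoring.coreos.com/v1alpha1`, and COO is assumed here to serve it under `monitoring.rhobs/v1alpha1`, which should be checked against the installed CRD.

```python
import json


def prometheus_agent_manifest(
    name: str,
    namespace: str,
    remote_write_url: str,
    scrape_interval: str = "30s",
    api_version: str = "monitoring.rhobs/v1alpha1",  # assumption, verify on the cluster
) -> dict:
    """Return a minimal PrometheusAgent custom resource as a plain dict."""
    return {
        "apiVersion": api_version,
        "kind": "PrometheusAgent",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "replicas": 1,
            "scrapeInterval": scrape_interval,
            # Hypothetical label; selects the ServiceMonitors to scrape.
            "serviceMonitorSelector": {"matchLabels": {"experiment": "agent-vs-server"}},
            "remoteWrite": [{"url": remote_write_url}],
        },
    }


if __name__ == "__main__":
    manifest = prometheus_agent_manifest(
        name="telemetry-agent",                            # hypothetical
        namespace="agent-vs-server",                       # hypothetical
        remote_write_url="http://receiver.example:9090/api/v1/write",  # hypothetical
    )
    print(json.dumps(manifest, indent=2))
```

A real deployment would most likely also need a service account with permissions for service discovery, which this sketch leaves out.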

is related to

MON-3872 Send OCP telemetry via Prometheus remote-write

MON-4101 Explore future telemetry architectures

Assignee: Unassigned

Reporter: Jan Fajerski

Votes: 0

Watchers: 4

Created: 2024/01/24 3:26 PM

Updated: 2024/12/10 6:45 PM
