-
Story
-
Resolution: Unresolved
-
Normal
-
None
-
Product / Portfolio Work
Monitoring issues or latency on the server-streaming endpoints (ReadTuples, LookupResources, LookupSubjects) is currently limited by the available metrics. Requests to these endpoints are captured in the request metrics (server_requests_seconds_bucket_seconds_bucket), but that metric only captures the time it takes to initiate a response. It does not capture finer-grained measurements such as:
Time to first message → important if the consumer expects quick feedback.
Inter-message latency → distribution of time gaps between messages.
Total stream duration → from open to close.
Points of failure → where in the stream an in-flight request fails.
OpenTelemetry has some options that may help fill those gaps, such as per-call metrics or perhaps configuring tracing or spans. This spike should look into what work is required to add this data and give a better picture of latency issues and failures on the streaming endpoints.
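As a starting point for the spike, the four measurements listed above can be sketched as a wrapper around a message stream. This is a minimal illustration in Python, not the service's implementation: `StreamTimings` and `timed_stream` are hypothetical names, the stream is modeled as a plain iterable, and nothing here calls an OpenTelemetry API. In a real handler the collected values would presumably be recorded into whatever histogram or span mechanism the spike selects.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


@dataclass
class StreamTimings:
    """Timing data for one server-streaming call (hypothetical type)."""

    time_to_first_message: Optional[float] = None
    inter_message_gaps: List[float] = field(default_factory=list)
    total_duration: Optional[float] = None
    messages_sent: int = 0
    error: Optional[str] = None  # exception type name if the stream failed in flight


def timed_stream(
    messages: Iterable[T],
    timings: StreamTimings,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[T]:
    """Yield messages unchanged while filling in `timings`.

    The stream counts as "open" when the consumer first starts iterating,
    because a generator body does not run until then.
    """
    opened = clock()
    previous: Optional[float] = None
    try:
        for message in messages:
            now = clock()
            if previous is None:
                # Time to first message: open -> first message available.
                timings.time_to_first_message = now - opened
            else:
                # Inter-message latency: gap since the previous message.
                timings.inter_message_gaps.append(now - previous)
            previous = now
            timings.messages_sent += 1
            yield message
    except Exception as exc:
        # Point of failure: messages_sent shows how far the stream got.
        timings.error = type(exc).__name__
        raise
    finally:
        # Total stream duration: open -> close, on success or failure.
        timings.total_duration = clock() - opened
```

The `clock` parameter exists only so the timings can be tested with a deterministic clock; by default the wrapper uses `time.monotonic`. Keeping `messages_sent` alongside `error` is one possible way to answer "where did it fail": a failure with zero messages sent is a different problem from one that occurs midway through a long stream.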
- is Informed by: RHCLOUD-41472 [SPIKE] Latency SLOs for streaming endpoints (Closed)