  1. Hybrid Cloud Console
  2. RHCLOUD-42392

Spike: OpenTelemetry Enhancements for Server Streaming gRPC Endpoints



      Monitoring issues or latency on the server streaming endpoints (ReadTuples, LookupResources, LookupSubjects) is limited with the currently available metrics. Requests to these endpoints are captured in the request metrics (server_requests_seconds_bucket_seconds_bucket), but that metric only captures the time it takes to initiate a response. It does not capture more detailed measurements such as:

      Time to first message → important if the consumer expects quick feedback.
      Inter-message latency → distribution of time gaps between messages.
      Total stream duration → from open to close.
      Points of failure → where streaming requests fail while in flight.

      OpenTelemetry has some options that may help fill those gaps, such as per-call metrics, or configuring tracing or spans. This spike should investigate what work is required to add this data and give a better picture of latency issues or failures with streaming endpoints.
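
      To make the four measurements above concrete, here is a minimal, stdlib-only Python sketch that wraps a server-streaming handler's response iterator and records time to first message, inter-message gaps, total stream duration, and where a failure occurred. `StreamStats`, `timed_stream`, and the `clock` parameter are hypothetical names used for illustration only; they are not part of gRPC or OpenTelemetry, and the actual service may be written in a different language.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


@dataclass
class StreamStats:
    """Timings for one server-streaming call (hypothetical structure)."""
    time_to_first_message: Optional[float] = None   # seconds from open to first message
    inter_message_gaps: List[float] = field(default_factory=list)  # seconds between messages
    total_duration: Optional[float] = None          # seconds from open to close
    messages_sent: int = 0                          # how far the stream got
    error: Optional[str] = None                     # exception type if it failed in flight


def timed_stream(
    responses: Iterable[T],
    stats: StreamStats,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[T]:
    """Yield each response unchanged while recording timings into stats."""
    opened = clock()
    last = None
    try:
        for message in responses:
            now = clock()
            if last is None:
                stats.time_to_first_message = now - opened
            else:
                stats.inter_message_gaps.append(now - last)
            last = now
            stats.messages_sent += 1
            yield message
    except Exception as exc:
        # Point of failure: the error type plus messages_sent shows how far we got.
        stats.error = type(exc).__name__
        raise
    finally:
        # Runs on normal completion, on failure, and on early consumer cancellation.
        stats.total_duration = clock() - opened
```

      In a real service, the recorded values would probably be exported rather than kept in memory, for example as histogram observations or as events on the call's span. Whether that is best done in an interceptor, a stats handler, or the handlers themselves is one of the questions for this spike.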

              Unassigned Unassigned
              anatale.openshift Antony Natale