Managed Service - Streams / MGDSTRM-8798

Canary incorrectly records producer/end-to-end latency owing to use of a shared client (and thus a shared connection) for producing and consuming


    • MK - Sprint 221

      WHAT

      As discovered by MGDSTRM-8698 (https://github.com/strimzi/strimzi-canary/issues/188), the canary records message latencies incorrectly. The recorded latency can be skewed upward by as much as the consumer's max wait time.

      This happens because the canary internally shares a single client between the produce and consume sides, so there is only one connection to each broker. If the canary produces a message while a fetch response is still pending on that connection, the produce response cannot be received until the fetch response completes.

      WHY

      This affects RHOSAK because it uses end-to-end message latencies for service alerts, and we also want to expose an internal SLI for message latency.

      HOW

      Fix the canary to use separate Sarama clients for producing and consuming. Ping the Strimzi team to make it clear that we are going to work on this, in case someone has already started on a fix.

       

              keithbwall Keith Wall
              Kafka Integrations