Hybrid Cloud Console / RHCLOUD-43567

Fix Kafka Consumer Rebalance Lock Acquisition Issues

      Issue 1: Infinite Blocking During Rebalance

      The on_partitions_assigned callback attempted to acquire a lock token from SpiceDB over gRPC
      without setting a timeout. When SpiceDB was slow or unavailable:

      • The gRPC call had no deadline and could hang indefinitely
      • The callback blocked for 12+ seconds
      • Messages were eventually processed WITHOUT a valid lock token
      • This violated the fencing mechanism designed to prevent split-brain scenarios
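
One way to bound the lock acquisition is sketched below. It is not the project's actual fix. It assumes a hypothetical fetch_token callable standing in for the SpiceDB gRPC call, which receives a time budget in seconds and raises TimeoutError when that budget is exceeded. With a real gRPC Python stub, the equivalent would be passing timeout= on the call and catching grpc.RpcError. The names RebalanceHandler, acquire_with_deadline, and LockAcquisitionError are illustrative only.

```python
import time
from typing import Callable, Optional


class LockAcquisitionError(RuntimeError):
    """Raised when no lock token could be obtained within the deadline."""


def acquire_with_deadline(
    fetch_token: Callable[[float], str],
    deadline_seconds: float = 5.0,
) -> str:
    """Call fetch_token with a bounded time budget and fail loudly on error.

    fetch_token is a hypothetical stand-in for the SpiceDB gRPC call. It is
    assumed to raise TimeoutError once the budget it was given is exceeded.
    """
    started = time.monotonic()
    try:
        token = fetch_token(deadline_seconds)
    except TimeoutError as exc:
        elapsed = time.monotonic() - started
        raise LockAcquisitionError(
            f"lock token not acquired after {elapsed:.2f}s"
        ) from exc
    if not token:
        # An empty token is treated the same as no token at all.
        raise LockAcquisitionError("lock service returned an empty token")
    return token


class RebalanceHandler:
    """Hypothetical rebalance listener that never keeps a stale token."""

    def __init__(self, fetch_token: Callable[[float], str],
                 deadline_seconds: float = 5.0) -> None:
        self._fetch_token = fetch_token
        self._deadline_seconds = deadline_seconds
        self.lock_token: Optional[str] = None

    def on_partitions_assigned(self, assigned) -> None:
        # Clear the previous token first, so a failed acquisition can never
        # leave the consumer processing with an outdated token.
        self.lock_token = None
        self.lock_token = acquire_with_deadline(
            self._fetch_token, self._deadline_seconds
        )
```

If acquisition fails, the exception propagates out of the callback instead of letting processing continue without a token. How the consumer loop reacts to that exception (for example, by exiting) is left to the caller.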

      Issue 2: Infinite Retry Loop

      When messages were processed without a lock token:

      • The consumer kept retrying the same message
      • After hitting max retries, it entered an infinite sleep loop
      • The pod never crashed, so recovery required manual intervention
      • Retrying was pointless because a lock token would not appear without a restart
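
The behaviour above suggests two rules: a missing lock token is not retryable, and exhausted retries should end in a crash rather than a sleep loop. The following is a minimal sketch of that idea, not the code from the actual fix. The function process_with_retries, both exception classes, and the handler(message, lock_token) signature are hypothetical.

```python
import time
from typing import Any, Callable, Optional


class MissingLockTokenError(RuntimeError):
    """Non-retryable: retrying cannot make a lock token appear."""


class RetriesExhaustedError(RuntimeError):
    """Raised after the final failed attempt instead of sleeping forever."""


def process_with_retries(
    message: Any,
    handler: Callable[[Any, str], None],
    lock_token: Optional[str],
    max_retries: int = 3,
    backoff_seconds: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Process one message and return the attempt number that succeeded."""
    if lock_token is None:
        # Fail fast: the handler is never invoked without a token.
        raise MissingLockTokenError("refusing to process without a lock token")

    last_error: Optional[Exception] = None
    for attempt in range(1, max_retries + 1):
        try:
            handler(message, lock_token)
            return attempt
        except Exception as exc:  # sketch only; narrow this in real code
            last_error = exc
            if attempt < max_retries:
                # Linear backoff between attempts, none after the last one.
                sleep(backoff_seconds * attempt)

    raise RetriesExhaustedError(
        f"gave up after {max_retries} attempts"
    ) from last_error
```

If the caller lets either exception propagate and terminate the process, the pod can be restarted by its orchestrator instead of idling until someone intervenes.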

       

              lpichler@redhat.com Libor Pichler