Managed Service - Streams / MGDSTRM-10909

KafkaBrokerStorageQuotaExceeded is fired due to double counted space usage

    • Type: Bug
    • Resolution: Done
    • Sprint: MK - Sprint 234

      The KafkaBrokerStorageQuotaExceeded critical alert fired in the last 7 days. Here is the PV usage dashboard for the affected instance over that period: https://grafana.app-sre.devshift.net/d/Q2NXRP8nk/usage-and-limits-2-persistent-volumes?orgId=1&var-cluster_id=7694399d-2170-4139-ae6c-79c637b28669&var-namespace=kafka-caorpfsgebb0ffrjq940&var-pvc=data-0-internal-data-pipeline-kafka-0&var-pvc=data-0-internal-data-pipeline-kafka-1&var-pvc=data-0-internal-data-pipeline-kafka-2&var-pvc=data-internal-data-pipeline-zookeeper-0&var-pvc=data-internal-data-pipeline-zookeeper-1&var-pvc=data-internal-data-pipeline-zookeeper-2

      I think this probably aligns with an OSD upgrade, during which the metrics were somehow counted twice for a short period. Because this instance uses more than 50% of its available storage, the doubling pushed the reported "used" space above the limit. On each occasion, the "free" space for each PV doubled at the same time as the "used" space.
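
The arithmetic behind this can be sketched as follows. This is only an illustration: the quota value, the usage figures, and the alert condition (reported used space above the quota) are assumptions for the example, not the actual alert rule.

```python
def double_counted(value_bytes: float) -> float:
    """Model the hypothesised glitch: the same volume's metric is summed twice."""
    return 2 * value_bytes


def exceeds_quota(used_bytes: float, quota_bytes: float) -> bool:
    """Assumed alert condition: reported used space is above the quota."""
    return used_bytes > quota_bytes


GIB = 1024 ** 3
quota = 100 * GIB  # hypothetical quota, not taken from the issue

busy_used = 0.55 * quota   # an instance using a bit more than half its storage
idle_used = 0.001 * quota  # an instance using 0.1% of its storage

# Without the glitch, the busy instance is well within its quota.
busy_fires_normally = exceeds_quota(busy_used, quota)

# With the glitch, anything above 50% of the quota is reported as above 100%.
busy_fires_doubled = exceeds_quota(double_counted(busy_used), quota)

# A nearly empty instance is reported at 0.2%, still far below the quota.
idle_fires_doubled = exceeds_quota(double_counted(idle_used), quota)
```

Under these assumptions, only an instance that already uses more than half of its quota can be pushed over the limit by a doubling.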

      All of the other Kafka clusters on that OSD cluster had similar doubling spikes for their used and free disk space, but they wouldn't cause the alert to fire since they all use less than 0.1% of their available storage.
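
Since "used" and "free" doubled together, the used fraction computed from a single sample is unchanged by the double count, which may help tell such a spike apart from real growth. A minimal sketch, assuming hypothetical sample values and helper names (`used_fraction`, `looks_double_counted`) that are not part of any existing alert rule:

```python
import math


def used_fraction(used: float, free: float) -> float:
    """Fraction of the volume in use, computed from one (used, free) sample."""
    return used / (used + free)


def looks_double_counted(prev: tuple, curr: tuple, rel_tol: float = 0.05) -> bool:
    """True if both used and free roughly doubled between two samples."""
    prev_used, prev_free = prev
    curr_used, curr_free = curr
    return (
        math.isclose(curr_used, 2 * prev_used, rel_tol=rel_tol)
        and math.isclose(curr_free, 2 * prev_free, rel_tol=rel_tol)
    )


before = (55, 45)   # hypothetical (used, free) in GiB before the spike
spike = (110, 90)   # both values doubled, as described in this issue
growth = (80, 20)   # real growth: used rises while free falls

# The ratio is the same before and during the double count.
same_ratio = math.isclose(used_fraction(*before), used_fraction(*spike))

spike_flagged = looks_double_counted(before, spike)
growth_flagged = looks_double_counted(before, growth)
```

Here the doubled sample keeps the same used fraction and is flagged, while genuine growth changes the fraction and is not.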

              lukchen@redhat.com Luke Chen
              Kafka Integrations