OBSDOCS-96

FluentdQueueLengthIncreasing alert description is wrong


    • Type: Bug
    • Resolution: Done
    • Priority: Normal
    • Logging 5.8
    • Logging 5.5
    • Logging
    • None
    • OBSDOCS (Oct 2 - Oct 23) #243

      In the current alert configuration, FluentdQueueLengthIncreasing is defined with severity "Warning" and for "1h". The documentation [1], however, lists the severity as critical and the duration as 12h.
       
      ----------------------------
      5.5
      https://github.com/openshift/cluster-logging-operator/blob/release-5.5/files/fluentd/fluentd_prometheus_alerts.yaml

      • "alert": "FluentdQueueLengthIncreasing"
        "annotations":
          "message": "For the last hour, fluentd {{ $labels.instance }} output '{{ $labels.plugin_id }}' average buffer queue length has increased continuously."
          "summary": "Fluentd is unable to keep up with traffic over time for forwarder output {{ $labels.plugin_id }}."
        "expr": |
          ( 0 * (deriv(fluentd_output_status_emit_records[1m] offset 1h))) + on(pod,plugin_id) ( deriv(fluentd_output_status_buffer_queue_length[10m]) > 0 and delta(fluentd_output_status_buffer_queue_length[1h]) > 1 )
        "for": "1h"
        "labels":
          "service": "fluentd"
          "severity": "Warning"
          namespace: "openshift-logging"
        ----------------------------
         
        The documentation should be corrected to match the alert definition.
         [1] https://docs.openshift.com/container-platform/4.10/logging/troubleshooting/cluster-logging-alerts.html#cluster-logging-colle[…]luster-logging-alerts
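
The mismatch reported above can be sketched as a small comparison. This is only an illustration: the values are copied from this issue, while the dictionary layout and the function name `find_mismatches` are hypothetical and are not part of the cluster-logging-operator or the documentation tooling.

```python
# Values copied from the issue text. The structure and names below are
# illustrative assumptions, not an actual operator or docs API.
DEPLOYED_RULE = {
    "alert": "FluentdQueueLengthIncreasing",
    "for": "1h",
    "labels": {"severity": "Warning"},
}
DOCUMENTED = {"for": "12h", "severity": "critical"}


def find_mismatches(rule, documented):
    """Return {field: (rule_value, documented_value)} for fields that differ.

    The comparison is case-insensitive, so "Warning" and "warning" match.
    """
    actual = {"for": rule["for"], "severity": rule["labels"]["severity"]}
    mismatches = {}
    for field, doc_value in documented.items():
        if actual[field].lower() != doc_value.lower():
            mismatches[field] = (actual[field], doc_value)
    return mismatches


if __name__ == "__main__":
    for field, (actual, doc) in find_mismatches(DEPLOYED_RULE, DOCUMENTED).items():
        print(f"{field}: rule has {actual!r}, documentation says {doc!r}")
```

Both fields are reported as mismatched for the values quoted in this issue. Once the documentation lists warning and 1h, the function returns an empty dictionary.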

              abrennan@redhat.com Ashleigh Brennan
              rhn-support-aharchin Akhil Harchinder (Inactive)