-
Bug
-
Resolution: Done
-
Normal
-
Logging 5.5
-
None
-
OBSDOCS (Oct 2 - Oct 23) #243
In the current alert configuration for FluentdQueueLengthIncreasing, the severity is defined as Warning and the "for" duration as 1h. However, the documentation [1] lists them as critical and 12h.
----------------------------
5.5
https://github.com/openshift/cluster-logging-operator/blob/release-5.5/files/fluentd/fluentd_prometheus_alerts.yaml
- "alert": "FluentdQueueLengthIncreasing"
  "annotations":
    "message": "For the last hour, fluentd {{ $labels.instance }} output '{{ $labels.plugin_id }}' average buffer queue length has increased continuously."
    "summary": "Fluentd is unable to keep up with traffic over time for forwarder output {{ $labels.plugin_id }}."
  "expr": |
    ( 0 * (deriv(fluentd_output_status_emit_records[1m] offset 1h))) + on(pod,plugin_id) ( deriv(fluentd_output_status_buffer_queue_length[10m]) > 0 and delta(fluentd_output_status_buffer_queue_length[1h]) > 1 )
  "for": "1h"
  "labels":
    "service": "fluentd"
    "severity": "Warning"
    namespace: "openshift-logging"
----------------------------
The documentation should be corrected to match the alert definition: severity Warning and a "for" duration of 1h.
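
The mismatch reported here could also be checked mechanically. The following is only a minimal sketch, assuming the field values are transcribed by hand from this ticket; `find_mismatches` is a hypothetical helper and not part of the cluster-logging-operator or any OpenShift tooling.

```python
# Hypothetical helper: compares alert fields transcribed by hand from this ticket.
def find_mismatches(deployed, documented):
    """Return {field: (deployed_value, documented_value)} for fields that differ."""
    mismatches = {}
    for field in sorted(set(deployed) | set(documented)):
        dep = deployed.get(field)
        doc = documented.get(field)
        # Compare case-insensitively so "Warning" and "warning" count as equal.
        if str(dep).lower() != str(doc).lower():
            mismatches[field] = (dep, doc)
    return mismatches


# Values from the release-5.5 alert rule quoted in this ticket.
deployed = {"for": "1h", "severity": "Warning"}
# Values the ticket reports seeing in the documentation [1].
documented = {"for": "12h", "severity": "critical"}

for field, (dep, doc) in find_mismatches(deployed, documented).items():
    print(f"{field}: alert rule has {dep!r}, documentation has {doc!r}")
```

Both fields are reported as differing, which matches the description above.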
[1] https://docs.openshift.com/container-platform/4.10/logging/troubleshooting/cluster-logging-alerts.html#cluster-logging-colle[…]luster-logging-alerts