  1. OpenShift Logging
  2. LOG-5381

Implement Alerts and Metrics Dashboard for Vector Output Buffer


    • This feature introduces an alert that triggers when log collectors are buffering logs to a cluster node's file system and the buffers are consuming more than 15% of the available space. This may indicate that the log collectors are experiencing back pressure from their configured outputs, and that administrators should take action to keep the collectors from potentially destabilizing the cluster node.
    • Feature
    • Log Collection - Sprint 252, Log Collection - Sprint 253

      Summary

      Add alerts that fire when collectors are potentially consuming too much node disk space.
      Add a metrics dashboard that shows the space consumed by the Vector output buffer on each node.

      Acceptance Criteria

      1. Implement Alerting:

      • Set up alerting rules to trigger alerts when the space consumed by the Vector Output Buffer exceeds 15% of the total disk space on each node.
      • Test the alerting system to ensure that alerts are fired appropriately when the criteria are met.
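
      The 15% criterion above comes down to simple arithmetic per node. The following is a minimal sketch of that check, not the operator's actual alerting rule; the `NodeSample` type, its field names, and the `nodes_to_alert` helper are hypothetical and exist only for illustration. It treats "exceeds" as strictly greater than the threshold.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Threshold taken from the acceptance criteria: 15% of total node disk space.
THRESHOLD_PERCENT = 15.0


@dataclass(frozen=True)
class NodeSample:
    """Hypothetical per-node measurement (names are illustrative only)."""
    node: str
    buffer_bytes: int      # space used by the Vector output buffer
    disk_total_bytes: int  # total disk space on the node


def buffer_percent(sample: NodeSample) -> float:
    """Return the buffer size as a percentage of total node disk space."""
    if sample.disk_total_bytes <= 0:
        raise ValueError(f"invalid disk size for node {sample.node!r}")
    return 100.0 * sample.buffer_bytes / sample.disk_total_bytes


def nodes_to_alert(samples: Iterable[NodeSample],
                   threshold: float = THRESHOLD_PERCENT) -> List[str]:
    """List the nodes whose buffer usage strictly exceeds the threshold."""
    return [s.node for s in samples if buffer_percent(s) > threshold]
```

      A check like this can also help when testing the alert: feeding it values just below, at, and above 15% shows which nodes are expected to fire.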

      2. Create Grafana Dashboards:

      • Panel 1: Display the absolute size of the Vector buffer via a graph by instance.
      • Panel 2: Display the percentage of buffer size relative to the total disk space on the node.

          Configure the panels in each dashboard to visualize the required metrics accurately.
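
      As a rough sketch of the two panels described above, the snippet below builds simplified panel definitions as Python dictionaries and prints them as JSON. The metric names `vector_buffer_byte_size` and `node_filesystem_size_bytes`, the `instance` join label, and the panel titles are assumptions made for illustration; the real metrics may carry different names or labels and could need relabeling before they can be divided. The dictionaries are a reduced form of a Grafana panel, not a complete dashboard.

```python
import json
from typing import Any, Dict, List

# Assumed names, not confirmed by this issue; adjust to the real metrics.
BUFFER_METRIC = "vector_buffer_byte_size"
DISK_METRIC = "node_filesystem_size_bytes"
JOIN_LABEL = "instance"


def absolute_size_expr() -> str:
    """PromQL for the absolute buffer size, summed per instance."""
    return f"sum by ({JOIN_LABEL}) ({BUFFER_METRIC})"


def percent_expr() -> str:
    """PromQL for buffer size as a percentage of total disk space."""
    disk = f"sum by ({JOIN_LABEL}) ({DISK_METRIC})"
    return f"100 * {absolute_size_expr()} / {disk}"


def make_panel(title: str, expr: str, unit: str) -> Dict[str, Any]:
    """Build one simplified time-series panel with a single query."""
    legend = "{{" + JOIN_LABEL + "}}"  # one series per instance
    return {
        "title": title,
        "type": "timeseries",
        "targets": [{"expr": expr, "legendFormat": legend}],
        "fieldConfig": {"defaults": {"unit": unit}},
    }


def build_panels() -> List[Dict[str, Any]]:
    """Return Panel 1 (absolute size) and Panel 2 (percentage)."""
    return [
        make_panel("Vector buffer size by instance",
                   absolute_size_expr(), "bytes"),
        make_panel("Vector buffer size as % of node disk",
                   percent_expr(), "percent"),
    ]


if __name__ == "__main__":
    print(json.dumps(build_panels(), indent=2))
```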

      3. Update Documentation

      Notes

      • We must determine whether these alerts can be enabled for non-infra namespaces and, if so, how. I believe there is no way for non-infra namespaces to be opted into cluster metrics.

              vparfono Vitalii Parfonov