OpenShift Logging / LOG-1918

Alert `FluentdNodeDown` always firing


Details

    • OBSDA-108 - Distribute an alternate Vector Log Collector
    • VERIFIED
    • Before this update, a name change of the deployed collector in the 5.3 release caused the `FluentdNodeDown` alert to fire continuously.
    • Logging (Core) - Sprint 209

    Description

      Description of problem:

      The alert `FluentdNodeDown` is always firing. The rule is:

          - alert: FluentdNodeDown
            annotations:
              message: Prometheus could not scrape fluentd {{ $labels.instance }} for more
                than 10m.
              summary: Fluentd cannot be scraped
            expr: |
              absent(up{job="fluentd"} == 1)
            for: 10m
            labels:
              service: fluentd
              severity: critical

      In 5.3, the collector job name changed from `fluentd` to `collector`, so the expression should probably be `absent(up{job="collector"} == 1)`.
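If the job label is what changed, a corrected rule might look like the sketch below. Note that renaming the annotation text and the `service` label to match the new collector name is an assumption on my part, not something the 5.3 release necessarily does:

```yaml
- alert: FluentdNodeDown
  annotations:
    message: Prometheus could not scrape collector {{ $labels.instance }} for more
      than 10m.
    summary: Collector cannot be scraped
  expr: |
    absent(up{job="collector"} == 1)
  for: 10m
  labels:
    service: collector
    severity: critical
```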

      Version-Release number of selected component (if applicable):

      cluster-logging.5.3.0-46 

      How reproducible:

      Always

      Steps to Reproduce:
      1. Deploy logging 5.3
      2. Check alerts in the OpenShift console
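The always-firing behavior can also be reproduced offline with `promtool test rules`. A sketch, assuming the rule from the description is saved as `fluentd_alert.yml` (the file name and series values are illustrative; because the `absent()` argument is a comparison expression rather than a plain vector selector, I expect the firing alert to carry only the rule's own labels):

```yaml
# fluentd_alert_test.yml -- run with: promtool test rules fluentd_alert_test.yml
rule_files:
  - fluentd_alert.yml
evaluation_interval: 1m
tests:
  - interval: 1m
    input_series:
      # A healthy collector, but it reports job="collector", not job="fluentd"
      - series: 'up{job="collector", instance="node1"}'
        values: '1x20'
    alert_rule_test:
      # After the 10m "for" window the alert is firing even though the
      # collector is up, because absent(up{job="fluentd"} == 1) matches nothing
      - eval_time: 15m
        alertname: FluentdNodeDown
        exp_alerts:
          - exp_labels:
              service: fluentd
              severity: critical
```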

      Actual results:

      The `FluentdNodeDown` alert is always firing, even though the collector is running.

      Expected results:

      The alert fires only when the collector cannot be scraped.
      Additional info: 


          People

            jcantril@redhat.com Jeffrey Cantrill
            qitang@redhat.com Qiaoling Tang
