Data Foundation Bugs / DFBUGS-360

[2260344] There is a pending alert CephClusterWarningState for a brief time with a timestamp of firing alert when firing alert appears


    • Bug
    • Resolution: Unresolved
    • Critical
    • odf-4.18
    • odf-4.13
    • ceph-monitoring
    • None
    • False
    • False
    • Committed
    • ?
    • If docs needed, set a value
    • Low
    • None

      Description of problem (please be as detailed as possible and provide log
      snippets):
      While testing the CephClusterWarningState alert (scenario: one OSD stopped), the following behaviour was observed:

      • The alert is raised correctly, in the `pending` state.
      • At the point where the alert should transition to the `firing` state, a second `pending` alert appears with a changed timestamp.
      • A `firing` alert also appears, with a correct timestamp.

      Version of all relevant components (if applicable):
      ocs 4.13.6-1

      Is this issue reproducible?
      (2/2)
      https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/10153/testReport/junit/tests.manage.monitoring.prometheus/test_deployment_status/test_ceph_osd_stopped/

      Steps to Reproduce:
      1. Downscale an osd deployment
      2. Monitor incoming alerts
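Step 2 can be approximated with a small polling loop. The sketch below is not the QE test code; it assumes direct access to the Prometheus HTTP API (`/api/v1/alerts`), and the names `fetch_alerts`, `collect_alerts`, `base_url` and `token` are hypothetical. On a real cluster the request would also need TLS and authentication set up for the monitoring route.

```python
import json
import time
import urllib.request


def fetch_alerts(base_url, token=None):
    """Return the active alerts from Prometheus' /api/v1/alerts endpoint.

    `base_url` and `token` are placeholders for the cluster's Prometheus
    route and a bearer token (assumptions, not taken from the ticket).
    """
    request = urllib.request.Request(f"{base_url}/api/v1/alerts")
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        return json.load(response)["data"]["alerts"]


def collect_alerts(fetch, alertname, rounds, interval=0.0):
    """Poll `fetch` and keep each distinct alert record once.

    Two records count as distinct when their labels, state or activeAt
    differ, so a pending -> firing transition yields two records.
    """
    seen = set()
    collected = []
    for _ in range(rounds):
        for alert in fetch():
            if alert["labels"].get("alertname") != alertname:
                continue
            key = (
                tuple(sorted(alert["labels"].items())),
                alert["state"],
                alert["activeAt"],
            )
            if key not in seen:
                seen.add(key)
                collected.append(alert)
        time.sleep(interval)
    return collected
```

A call such as `collect_alerts(lambda: fetch_alerts(base_url, token), "CephClusterWarningState", rounds=120, interval=30)` would cover roughly an hour of monitoring. Passing the fetch function in as an argument keeps the collection logic testable without a cluster.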

      Actual results:
      Collected alerts:

      {'labels': {'alertname': 'CephClusterWarningState', 'container': 'mgr', 'endpoint': 'http-metrics', 'instance': '172.17.174.59:9283', 'job': 'rook-ceph-mgr', 'managedBy': 'ocs-storagecluster', 'namespace': 'openshift-storage', 'pod': 'rook-ceph-mgr-a-79989c4657-w48qm', 'service': 'rook-ceph-mgr', 'severity': 'warning'},
       'annotations': {'description': 'Storage cluster is in warning state for more than 15m.', 'message': 'Storage cluster is in degraded state', 'severity_level': 'warning', 'storage_type': 'ceph'},
       'state': 'pending', 'activeAt': '2023-12-22T20:58:22.229278715Z', 'value': '1e+00'},

      {'labels': {'alertname': 'CephClusterWarningState', 'container': 'mgr', 'endpoint': 'http-metrics', 'instance': '172.17.174.59:9283', 'job': 'rook-ceph-mgr', 'managedBy': 'ocs-storagecluster', 'namespace': 'openshift-storage', 'pod': 'rook-ceph-mgr-a-79989c4657-w48qm', 'service': 'rook-ceph-mgr', 'severity': 'warning'},
       'annotations': {'description': 'Storage cluster is in warning state for more than 15m.', 'message': 'Storage cluster is in degraded state', 'severity_level': 'warning', 'storage_type': 'ceph'},
       'state': 'pending', 'activeAt': '2023-12-22T21:27:52.229278715Z', 'value': '1e+00'},

      {'labels': {'alertname': 'CephClusterWarningState', 'container': 'mgr', 'endpoint': 'http-metrics', 'instance': '172.17.174.59:9283', 'job': 'rook-ceph-mgr', 'managedBy': 'ocs-storagecluster', 'namespace': 'openshift-storage', 'pod': 'rook-ceph-mgr-a-79989c4657-w48qm', 'service': 'rook-ceph-mgr', 'severity': 'warning'},
       'annotations': {'description': 'Storage cluster is in warning state for more than 15m.', 'message': 'Storage cluster is in degraded state', 'severity_level': 'warning', 'storage_type': 'ceph'},
       'state': 'firing', 'activeAt': '2023-12-22T21:27:52.229278715Z', 'value': '1e+00'}

      Expected results:
      Only 2 alerts should be collected: 1 pending and 1 firing.
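The expectation can be written as a check over the collected records. The following is a minimal sketch, not the assertion used by the actual test; `check_alert_records` is a hypothetical helper, and the sample data is the three records listed under "Actual results", reduced to the two fields that differ between them.

```python
from collections import Counter


def check_alert_records(records):
    """Compare collected records with the one-pending, one-firing pattern."""
    states = Counter(record["state"] for record in records)
    firing_times = {r["activeAt"] for r in records if r["state"] == "firing"}
    # Pending records that carry the same activeAt as a firing record
    # correspond to the extra alert described in this report.
    suspicious = [
        r for r in records
        if r["state"] == "pending" and r["activeAt"] in firing_times
    ]
    return {
        "as_expected": states == Counter({"pending": 1, "firing": 1}),
        "state_counts": dict(states),
        "pending_with_firing_timestamp": suspicious,
    }


# Records from "Actual results" (labels and annotations are identical
# across all three, so they are omitted here).
collected = [
    {"state": "pending", "activeAt": "2023-12-22T20:58:22.229278715Z"},
    {"state": "pending", "activeAt": "2023-12-22T21:27:52.229278715Z"},
    {"state": "firing", "activeAt": "2023-12-22T21:27:52.229278715Z"},
]
report = check_alert_records(collected)
```

For the records in this ticket, `report["as_expected"]` is false and `report["pending_with_firing_timestamp"]` holds the second pending record, which is the one sharing its `activeAt` with the firing alert.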

      Additional info:

              aruniiird Arun Kumar Mohan
              fbalak Filip Balak
              Harish Nallur Vittal Rao