OpenShift Virtualization / CNV-55759

[4.17] CNV PodDisruptionBudgetAtLimit Silence Only Silences the First Alert

    • CNV I/U Operators Sprint 267
    • None

      Description of problem:

      On CNV 4.17.3 / OCP 4.17.10 this alert is still firing constantly. There are 379 instances firing in the last insights report:

      cat ocp/5866595/b78eb09e-a2c0-4bdb-af29-5ae974612241/20250131160428-a4ee2e651a5d4667ac53881b968b3256/config/alerts.json | jq '.[] | select(.labels.alertname == "PodDisruptionBudgetAtLimit") | .labels.alertname' | wc -l
      379
      

      They are all from when the VMs were deployed on Jan 22 or 23:

      {
        "labels": {
          "alertname": "PodDisruptionBudgetAtLimit",
          "namespace": "vmtest",
          "openshift_io_alert_source": "platform",
          "poddisruptionbudget": "kubevirt-disruption-budget-b7mx4",
          "prometheus": "openshift-monitoring/k8s",
          "severity": "warning"
        },
        "annotations": {
          "description": "The pod disruption budget is at the minimum disruptions allowed level. The number of current healthy pods is equal to the desired healthy pods.",
          "runbook_url": "https://github.com/openshift/runbooks/blob/master/alerts/cluster-kube-controller-manager-operator/PodDisruptionBudgetAtLimit.md",
          "summary": "The pod disruption budget is preventing further disruption to pods."
        },
        "endsAt": "2025-01-31T16:01:56.284Z",
        "startsAt": "2025-01-23T16:21:26.284Z",
        "updatedAt": "2025-01-31T15:57:56.345Z",
        "status": {
          "inhibitedBy": [],
          "silencedBy": [],
          "state": "active"
        }
      }

      The operator is logging once per hour that the alert is already silenced:

      ./hco-operator-598f4b7cbc-d7lb4/hyperconverged-cluster-operator/hyperconverged-cluster-operator/logs/current.log:2025-01-22T16:43:34.664979676Z {"level":"info","ts":"2025-01-22T16:43:34Z","logger":"controller_observability","msg":"KubeVirt PodDisruptionBudgetAtLimit alerts are already silenced"}
      

      It looks like the operator silenced only the first alert and none of the later ones.

      ./hco-operator-598f4b7cbc-d7lb4/hyperconverged-cluster-operator/hyperconverged-cluster-operator/logs/previous.log:2025-01-13T18:21:26.255441563Z {"level":"info","ts":"2025-01-13T18:21:26Z","logger":"controller_observability","msg":"Silenced PodDisruptionBudgetAtLimit alerts"}
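
      To check this against the insights archive, the alerts can be split by whether any silence applies to them. The following is a minimal sketch, assuming alerts.json is a JSON array of alert objects shaped like the example above (with labels.alertname and status.silencedBy); the function name summarize_pdb_alerts is hypothetical and not part of any CNV tooling.

```python
import json
import sys
from collections import Counter

ALERT_NAME = "PodDisruptionBudgetAtLimit"


def summarize_pdb_alerts(alerts, alert_name=ALERT_NAME):
    """Count alerts named `alert_name`, split by whether a silence applies.

    Assumes each alert is a dict with `labels.alertname` and
    `status.silencedBy`, as in the alert excerpt in this report.
    """
    counts = Counter()
    for alert in alerts:
        if alert.get("labels", {}).get("alertname") != alert_name:
            continue
        # An empty or missing silencedBy list means no silence matched.
        silenced_by = alert.get("status", {}).get("silencedBy") or []
        counts["silenced" if silenced_by else "unsilenced"] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarize_alerts.py path/to/config/alerts.json
    with open(sys.argv[1]) as fh:
        result = summarize_pdb_alerts(json.load(fh))
    print(f"silenced={result['silenced']} unsilenced={result['unsilenced']}")
```

      If the silence covered every KubeVirt PDB alert, the unsilenced count would be expected to be 0.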
      

      Version-Release number of selected component (if applicable):

      
      CNV 4.17.3 / OCP 4.17.10
      
      

      How reproducible:

      Deploy a number of VMs across the cluster. The PDBs are created automatically, and their alerts are not silenced.
      

      Actual results:

      
      379 alerts
      
      

      Expected results:

      No alerts
      

              jvilaca@redhat.com João Vilaça
              rhn-support-mrobson Matt Robson
              Krzysztof Majcher Krzysztof Majcher