  1. OpenShift Bugs
  2. OCPBUGS-8993

OpenShift Alerting Rules Style-Guide Compliance

    • Bug
    • Resolution: Done-Errata
    • Minor
    • None
    • 4.10
    • OLM
    • Low
    • None
    • Grumpy 241, Happy 242, INKEY$ (OPRUN 243)
    • 3
    • Rejected
    • Unspecified
    • NA
    • Release Note Not Required

      Hello,

      The OpenShift Monitoring Team has published a set of guidelines for
      writing alerting rules in OpenShift, including a basic style guide.
      You can find these here:

      https://github.com/openshift/enhancements/blob/master/enhancements/monitoring/alerting-consistency.md
      https://github.com/openshift/enhancements/blob/master/enhancements/monitoring/alerting-consistency.md#style-guide

      A subset of these are now being enforced in OpenShift End-to-End
      tests [1], with temporary exceptions for existing non-compliant rules.

      This component was found to have the following issues:

      • Alerts without summary and/or description annotations:
        • CsvAbnormalFailedOver2Min
        • CsvAbnormalOver30Min
        • InstallPlanStepAppliedWithWarnings

      Alerts MUST include summary and description annotations.

      Think of summary as the first line of a commit message, or an email
      subject line. It should be brief but informative. The description is
      the longer, more detailed explanation of the alert.

      The enhancement document linked above has examples of alerts with
      these annotations.
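
As a rough illustration of the annotation requirement, the sketch below models one alerting rule as a Python dictionary (the same shape as an entry under `rules:` in a Prometheus rule file) and reports which required annotations are missing. The alert name, metric, and annotation text are hypothetical and are not taken from the operator-lifecycle-manager repository; the check is a simplified stand-in and not the actual End-to-End test.

```python
# Hypothetical rule, shaped like one entry under `rules:` in a Prometheus
# rule file. The alert name, expression, and annotation text are illustrative
# only and do not come from the operator-lifecycle-manager repository.
example_rule = {
    "alert": "ExampleOperatorInstallFailing",
    "expr": "example_install_failures_total > 0",
    "for": "5m",
    "labels": {"severity": "warning"},
    "annotations": {
        # Brief, like an email subject line.
        "summary": "Operator install is failing.",
        # Longer explanation of what is wrong and where.
        "description": (
            "Install of {{ $labels.name }} in namespace "
            "{{ $labels.namespace }} has been failing for more than 5 minutes."
        ),
    },
}

REQUIRED_ANNOTATIONS = ("summary", "description")


def missing_annotations(rule):
    """Return the required annotation names that are absent or blank."""
    annotations = rule.get("annotations") or {}
    return [
        name
        for name in REQUIRED_ANNOTATIONS
        if not str(annotations.get(name, "")).strip()
    ]
```

Running `missing_annotations` over every rule in a rule file would yield a list like the one reported above for this component.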

      • Alerts found to not include a namespace label:
        • InstallPlanStepAppliedWithWarnings

      Alerts SHOULD include a namespace label indicating the alert's source.

      This requirement originally comes from our SRE team, as they use the
      namespace label as the first means of routing alerts. Many alerts
      already include a namespace label as a result of the PromQL
      expressions used; others may require a static label.

      Example of a change to PromQL to include a namespace label:

      https://github.com/openshift/cluster-monitoring-operator/commit/52d1f05#diff-9024dcef0fd244c0267c46858da24fbd1f45633515fafae0f98781b20805ff1dL22-R22

      Example of adding a static namespace label:

      https://github.com/openshift/cluster-monitoring-operator/commit/52d1f05#diff-352702e71122d34a1be04c0588356cd8cb8a10df547f1c3c39fec18fa75b1593R304
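
The static-label approach can be sketched in Python as follows, again modelling a rule as a dictionary. The rule contents and the namespace value are hypothetical. Note that this check only looks at static labels: a rule whose PromQL expression already produces a namespace label would need no change, and this sketch does not try to detect that case.

```python
def has_static_namespace_label(rule):
    """Return True if the rule sets a non-blank static `namespace` label."""
    labels = rule.get("labels") or {}
    return bool(str(labels.get("namespace", "")).strip())


def with_static_namespace(rule, namespace):
    """Return a copy of the rule with a static namespace label added.

    An existing namespace label is kept as-is; the input rule is not modified.
    """
    updated = dict(rule)
    labels = dict(rule.get("labels") or {})
    labels.setdefault("namespace", namespace)
    updated["labels"] = labels
    return updated


# Hypothetical rule without a namespace label (names are illustrative only).
rule = {
    "alert": "ExampleStepAppliedWithWarnings",
    "expr": "sum(example_warnings_total) > 0",
    "labels": {"severity": "warning"},
}

# "example-namespace" is a placeholder for the namespace the alert comes from.
fixed = with_static_namespace(rule, "example-namespace")
```

Here `fixed["labels"]` carries both the original `severity` label and the new `namespace` label, while `rule` itself is left unchanged.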

      If you have questions about how best to modify your alerting rules
      to include a namespace label, please reach out to the OpenShift
      Monitoring Team in the #forum-monitoring channel on Slack, or on our
      mailing list: team-monitoring@redhat.com

      Thank you!

      Repo: operator-framework/operator-lifecycle-manager

      [1]: https://github.com/openshift/origin/commit/097e7a6

              spasquie@redhat.com Simon Pasquier
              openshift_jira_bot OpenShift Jira Bot
              bruno andrade bruno andrade
              Red Hat Employee