OpenShift Bugs: OCPBUGS-29180

unable to use `continue: true` in user-defined AlertmanagerConfig


    • Type: Bug
    • Resolution: Done-Errata
    • Priority: Major
    • 4.15.0
    • 4.12
    • Monitoring

      This is a clone of issue OCPBUGS-28251. The following is the description of the original issue:

      Description of problem:

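      Defining multiple receivers in a single user-defined AlertmanagerConfig with `continue: true` on a route does not work: the `continue` statement is removed when the object is saved.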

      Version-Release number of selected component (if applicable):

       

      How reproducible:

      always   

      Steps to Reproduce:

      #### Monitoring for user-defined projects is enabled
      ```
      oc -n openshift-monitoring get configmap cluster-monitoring-config -o yaml | head -4
      ```
      ```
      apiVersion: v1
      data:
        config.yaml: |
          enableUserWorkload: true
      ```
      
      #### Separate Alertmanager instance for user-defined alert routing is enabled and configured
      ```
      oc -n openshift-user-workload-monitoring get configmap user-workload-monitoring-config -o yaml | head -6
      ```
      ```
      apiVersion: v1
      data:
        config.yaml: |
          alertmanager:
            enabled: true
            enableAlertmanagerConfig: true
      ```
      Create a testing namespace:
      ```
      oc new-project libor-alertmanager-testing
      ```
      ## TESTING - MULTIPLE RECEIVERS IN ALERTMANAGERCONFIG
      Single AlertmanagerConfig
      `alertmanager_config_webhook_and_email_rootDefault.yaml`
      ```
      apiVersion: monitoring.coreos.com/v1beta1
      kind: AlertmanagerConfig
      metadata:
        name: libor-alertmanager-testing-email-webhook
        namespace: libor-alertmanager-testing
      spec:
        receivers:
        - name: 'libor-alertmanager-testing-webhook'
          webhookConfigs:
            - url: 'http://prometheus-msteams.internal-monitoring.svc:2000/occ-alerts'
        - name: 'libor-alertmanager-testing-email'
          emailConfigs:
            - to: USER@USER.CO
              requireTLS: false
              sendResolved: true
        - name: Default
        route:
          groupBy:
          - namespace
          receiver: Default
          groupInterval: 60s
          groupWait: 60s
          repeatInterval: 12h
          routes:
          - matchers:
            - name: severity
              value: critical
              matchType: '='
              continue: true
            receiver: 'libor-alertmanager-testing-webhook'
          - matchers:
            - name: severity
              value: critical
              matchType: '='
            receiver: 'libor-alertmanager-testing-email'
      ```
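      For comparison: in the Prometheus Operator AlertmanagerConfig API, `continue` is a field of the route itself, a sibling of `matchers` and `receiver`. As rendered above, the manifest nests it inside the matcher entry. The fragment below is a sketch of the same two child routes with `continue` at route level. It only illustrates field placement and is an assumption about the intended configuration, not a confirmed fix for the behavior reported here.

      ```yaml
      # Sketch only: child routes with `continue` as a route-level field.
      spec:
        route:
          receiver: Default
          routes:
          - receiver: 'libor-alertmanager-testing-webhook'
            matchers:
            - name: severity
              value: critical
              matchType: '='
            continue: true   # sibling of matchers/receiver, not part of the matcher
          - receiver: 'libor-alertmanager-testing-email'
            matchers:
            - name: severity
              value: critical
              matchType: '='
      ```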
      Once saved, the `continue` statement is removed from the object.

      The configuration applied to Alertmanager contains `continue: false` statements:
      ```
      oc exec -n openshift-user-workload-monitoring alertmanager-user-workload-0 -- amtool config show --alertmanager.url http://localhost:9093
      ```
      ```
      route:
        receiver: Default
        group_by:
        - namespace
        continue: false
        routes:
        - receiver: libor-alertmanager-testing/libor-alertmanager-testing-email-webhook/Default
          group_by:
          - namespace
          matchers:
          - namespace="libor-alertmanager-testing"
          continue: true
          routes:
          - receiver: libor-alertmanager-testing/libor-alertmanager-testing-email-webhook/libor-alertmanager-testing-webhook
            matchers:
            - severity="critical"
            continue: false  <----
          - receiver: libor-alertmanager-testing/libor-alertmanager-testing-email-webhook/libor-alertmanager-testing-email
            matchers:
            - severity="critical"
            continue: false <-----
      ```
      If I update the statements to read `continue: true` and test the routing tree at https://prometheus.io/webtools/alerting/routing-tree-editor/, I get the desired results.
      
      Workaround: use two separate AlertmanagerConfig files. With separate files, the `continue` statement is added.
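
      The workaround could look roughly like the two AlertmanagerConfig objects below, one per receiver. In the `amtool` output above, the generated top-level route for an AlertmanagerConfig carries `continue: true`, so an alert matching both objects should reach both receivers. The object names are hypothetical, since the report does not include the actual files, and this sketch has not been verified against a cluster.

      ```yaml
      # Sketch of the two-file workaround; receiver settings copied from the report.
      apiVersion: monitoring.coreos.com/v1beta1
      kind: AlertmanagerConfig
      metadata:
        name: libor-alertmanager-testing-webhook   # hypothetical name
        namespace: libor-alertmanager-testing
      spec:
        receivers:
        - name: 'libor-alertmanager-testing-webhook'
          webhookConfigs:
          - url: 'http://prometheus-msteams.internal-monitoring.svc:2000/occ-alerts'
        route:
          groupBy:
          - namespace
          receiver: 'libor-alertmanager-testing-webhook'
          matchers:
          - name: severity
            value: critical
            matchType: '='
      ---
      apiVersion: monitoring.coreos.com/v1beta1
      kind: AlertmanagerConfig
      metadata:
        name: libor-alertmanager-testing-email   # hypothetical name
        namespace: libor-alertmanager-testing
      spec:
        receivers:
        - name: 'libor-alertmanager-testing-email'
          emailConfigs:
          - to: USER@USER.CO
            requireTLS: false
            sendResolved: true
        route:
          groupBy:
          - namespace
          receiver: 'libor-alertmanager-testing-email'
          matchers:
          - name: severity
            value: critical
            matchType: '='
      ```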
      
      

      Actual results:

      Once saved, the `continue` statement is removed from the object.

      Expected results:

      The `continue: true` statement is retained and applied to Alertmanager.

      Additional info:

          

              spasquie@redhat.com Simon Pasquier
              openshift-crt-jira-prow OpenShift Prow Bot
              Junqi Zhao Junqi Zhao