  1. OpenShift Bugs
  2. OCPBUGS-48050

[release-4.18] AlertmanagerConfig with missing options causes Alertmanager to crash


    • Bug
    • Resolution: Unresolved
    • Normal
    • None
    • 4.13.0
    • Monitoring
    • Moderate
    • No
    • MON Sprint 264
    • 1
    • False
    • Before this fix, if the SMTP smarthost or SMTP from fields under EmailConfigs were specified at neither the global nor the receiver level in the Alertmanager config, Alertmanager crashed because these are required fields.

      With this fix, prometheus-operator fails reconciliation if these fields are specified at neither the global nor the receiver level, which avoids pushing an invalid config to Alertmanager.
    • Bug Fix
    • In Progress

      Description of problem:

      AlertmanagerConfig with missing options causes Alertmanager to crash

      Version-Release number of selected component (if applicable):

       

      How reproducible:

      Always

      Steps to Reproduce:

      A cluster administrator has enabled monitoring for user-defined projects.
      CMO ConfigMap:
      
      ~~~
      config.yaml: |
        enableUserWorkload: true
        prometheusK8s:
          retention: 7d
      ~~~
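
      For easier reproduction, the fragment above can be placed in a complete manifest. This is a sketch that assumes the standard CMO ConfigMap name and namespace (cluster-monitoring-config in openshift-monitoring), neither of which is stated in this report:

      ~~~yaml
      apiVersion: v1
      kind: ConfigMap
      metadata:
        # Assumed: the default Cluster Monitoring Operator ConfigMap location.
        name: cluster-monitoring-config
        namespace: openshift-monitoring
      data:
        config.yaml: |
          enableUserWorkload: true
          prometheusK8s:
            retention: 7d
      ~~~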
      
      A cluster administrator has enabled alert routing for user-defined projects. 
      
      UWM ConfigMap / CMO ConfigMap:
      
      ~~~
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: user-workload-monitoring-config
        namespace: openshift-user-workload-monitoring
      data:
        config.yaml: |
          alertmanager:
            enabled: true 
            enableAlertmanagerConfig: true
      ~~~
      
      Verify the existing config:
      
      ~~~
      $ oc exec -n openshift-user-workload-monitoring alertmanager-user-workload-0 -- amtool config show --alertmanager.url http://localhost:9093  
      global:
        resolve_timeout: 5m
        http_config:
          follow_redirects: true
        smtp_hello: localhost
        smtp_require_tls: true
        pagerduty_url: https://events.pagerduty.com/v2/enqueue
        opsgenie_api_url: https://api.opsgenie.com/
        wechat_api_url: https://qyapi.weixin.qq.com/cgi-bin/
        victorops_api_url: https://alert.victorops.com/integrations/generic/20131114/alert/
        telegram_api_url: https://api.telegram.org
      route:
        receiver: Default
        group_by:
        - namespace
        continue: false
      receivers:
      - name: Default
      templates: []
      ~~~
      
      Create an AlertmanagerConfig without the "smtp_from" and "smtp_smarthost" options:
      
      ~~~
      apiVersion: monitoring.coreos.com/v1alpha1
      kind: AlertmanagerConfig
      metadata:
        name: example
        namespace: example-namespace
      spec:
        receivers:
          - emailConfigs:
              - to: some.username@example.com
            name: custom-rules1
        route:
          matchers:
            - name: alertname
          receiver: custom-rules1
          repeatInterval: 1m
      ~~~
      
      Check the Alertmanager logs. The following error is seen:
      
      ~~~
      ts=2023-09-05T12:07:33.449Z caller=coordinator.go:118 level=error component=configuration msg="Loading configuration file failed" file=/etc/alertmanager/config_out/alertmanager.env.yaml err="no global SMTP smarthost set"
      ~~~ 
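
      For comparison, here is a sketch of the same AlertmanagerConfig with the two required fields set at the receiver level. The `from` and `smarthost` values are placeholder assumptions, not values from this report. With both set (here or in the global config), the "no global SMTP smarthost set" error should not be triggered.

      ~~~yaml
      apiVersion: monitoring.coreos.com/v1alpha1
      kind: AlertmanagerConfig
      metadata:
        name: example
        namespace: example-namespace
      spec:
        receivers:
          - emailConfigs:
              - to: some.username@example.com
                # Assumed placeholder sender address; replace with a real one.
                from: alertmanager@example.com
                # Assumed placeholder SMTP relay in host:port form.
                smarthost: smtp.example.com:587
            name: custom-rules1
        route:
          matchers:
            - name: alertname
          receiver: custom-rules1
          repeatInterval: 1m
      ~~~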

      Actual results:

      Alertmanager fails to restart.

      Expected results:

      The AlertmanagerConfig resource should be pre-validated so that an invalid config is not pushed to Alertmanager.

      Additional info:

      Reproducible with and without user workload Alertmanager.

              janantha@redhat.com Jayapriya Pai
              rhn-support-krg Kruthika G
              Junqi Zhao Junqi Zhao
              Simon Pasquier