OpenShift Bugs / OCPBUGS-18656

AlertmanagerConfig with missing options causes Alertmanager to crash

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • None
    • 4.13.0, 4.12.0
    • Monitoring
    • None
    • Moderate
    • No
    • MON Sprint 242, MON Sprint 263, MON Sprint 264
    • 3
    • False

      Description of problem:

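      An AlertmanagerConfig whose email receiver omits the SMTP sender and smarthost options, on a cluster with no global SMTP settings, produces a generated Alertmanager configuration that fails to load, and Alertmanager crashes.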

      Version-Release number of selected component (if applicable):

       

      How reproducible:

      Always

      Steps to Reproduce:

      A cluster administrator has enabled monitoring for user-defined projects in the Cluster Monitoring Operator (CMO) ConfigMap:
      
      ~~~
       config.yaml: |
          enableUserWorkload: true
          prometheusK8s:
            retention: 7d
      ~~~
      
      A cluster administrator has enabled alert routing for user-defined projects in the user workload monitoring (UWM) ConfigMap, shown here, or in the CMO ConfigMap:
      
      ~~~
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: user-workload-monitoring-config
        namespace: openshift-user-workload-monitoring
      data:
        config.yaml: |
          alertmanager:
            enabled: true 
            enableAlertmanagerConfig: true
      ~~~
      
      Verify the existing configuration:
      
      ~~~
      $ oc exec -n openshift-user-workload-monitoring alertmanager-user-workload-0 -- amtool config show --alertmanager.url http://localhost:9093  
      global:
        resolve_timeout: 5m
        http_config:
          follow_redirects: true
        smtp_hello: localhost
        smtp_require_tls: true
        pagerduty_url: https://events.pagerduty.com/v2/enqueue
        opsgenie_api_url: https://api.opsgenie.com/
        wechat_api_url: https://qyapi.weixin.qq.com/cgi-bin/
        victorops_api_url: https://alert.victorops.com/integrations/generic/20131114/alert/
        telegram_api_url: https://api.telegram.org
      route:
        receiver: Default
        group_by:
        - namespace
        continue: false
      receivers:
      - name: Default
      templates: []
      ~~~
      
      Create an AlertmanagerConfig with an email receiver that does not set the "smtp_from" and "smtp_smarthost" options:
      
      ~~~
      apiVersion: monitoring.coreos.com/v1alpha1
      kind: AlertmanagerConfig
      metadata:
        name: example
        namespace: example-namespace
      spec:
        receivers:
          - emailConfigs:
              - to: some.username@example.com
            name: custom-rules1
        route:
          matchers:
            - name: alertname
          receiver: custom-rules1
          repeatInterval: 1m
      ~~~
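      
      For comparison, a sketch of the same resource with the sender and smarthost set on the receiver itself. This should avoid the "no global SMTP smarthost set" error, because the receiver no longer depends on global SMTP settings. The "from" and "smarthost" values are placeholders and are not taken from this report.
      
      ~~~
      apiVersion: monitoring.coreos.com/v1alpha1
      kind: AlertmanagerConfig
      metadata:
        name: example
        namespace: example-namespace
      spec:
        receivers:
          - name: custom-rules1
            emailConfigs:
              - to: some.username@example.com
                # Placeholder values (assumptions, not from this report)
                from: alertmanager@example.com
                smarthost: smtp.example.com:587
        route:
          matchers:
            - name: alertname
          receiver: custom-rules1
          repeatInterval: 1m
      ~~~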
      
      Check the Alertmanager logs. The following error is seen:
      
      ~~~
      ts=2023-09-05T12:07:33.449Z caller=coordinator.go:118 level=error component=configuration msg="Loading configuration file failed" file=/etc/alertmanager/config_out/alertmanager.env.yaml err="no global SMTP smarthost set"
      ~~~ 

      Actual results:

      Alertmanager fails to restart.

      Expected results:

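      The AlertmanagerConfig resource should be validated up front, so that an invalid resource is rejected before it reaches the generated Alertmanager configuration.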

      Additional info:

      Reproducible with and without user workload Alertmanager.
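      
      For the case without user workload Alertmanager, alert routing for user-defined projects would be enabled on the platform Alertmanager through the CMO ConfigMap. A minimal sketch, assuming the standard cluster-monitoring-config ConfigMap in openshift-monitoring and the enableUserAlertmanagerConfig option. The exact configuration used for this reproduction is not given in the report.
      
      ~~~
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: cluster-monitoring-config
        namespace: openshift-monitoring
      data:
        config.yaml: |
          enableUserWorkload: true
          alertmanagerMain:
            # Assumed setting; not shown in the original report
            enableUserAlertmanagerConfig: true
      ~~~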

              janantha@redhat.com Jayapriya Pai
              rhn-support-krg Kruthika G
              Junqi Zhao
              Simon Pasquier