OpenShift Bugs / OCPBUGS-58408

Backport of Prometheus Operator Bug Fix: "One Alertmanager Config failing blocks all others" to OCP 4.17


    • Bug
    • Resolution: Unresolved
    • Undefined
    • 4.16.z
    • Monitoring
    • Quality / Stability / Reliability
    • Mon Sprint 273, MON Sprint 274, MON Sprint 275, MON Sprint 276, MON Sprint 277, MON Spring 278

      Description of problem:

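      When an AlertmanagerConfig contains an invalid URL, the sync of the Alertmanager configuration fails, so this one failing AlertmanagerConfig blocks all others. The following logs are present in the prometheus-user-workload pods: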

      26 jun 2025, 12:13:57.796 |   level=error ts=2025-06-26T10:13:57.785124884Z caller=klog.go:126 component=k8s_client_runtime func=ErrorDepth msg="sync \"openshift-user-workload-monitoring/user-workload\" failed: provision alertmanager configuration: failed to generate Alertmanager configuration: AlertmanagerConfig XXXXX/XXXXX: SlackConfig[0]: invalid URL \"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\" in key \"api_url\" from secret \"XXXXXXXXXXXX\": validate url from string failed for xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx: unsupported scheme \"\" for URL"
          

      This occurs with Alertmanager enabled for user workload monitoring. It was reported with a Slack receiver in the AlertmanagerConfig.
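
      The final part of the log message (unsupported scheme "" for URL) suggests that the value stored in the secret has no http or https scheme. The Go sketch below reproduces that class of error with the standard net/url package. It is an illustration only: validateURL and the example hostnames are hypothetical, and this is not the Prometheus Operator's own validation code.

```go
package main

import (
	"fmt"
	"net/url"
)

// validateURL is a hypothetical helper (not the operator's code) that
// accepts only absolute http or https URLs with a host.
func validateURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("parse %q: %w", raw, err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return fmt.Errorf("unsupported scheme %q for URL", u.Scheme)
	}
	if u.Host == "" {
		return fmt.Errorf("missing host for URL")
	}
	return nil
}

func main() {
	// Example values are made up; the first one lacks a scheme.
	candidates := []string{
		"hooks.example.com/services/T000/B000",
		"https://hooks.example.com/services/T000/B000",
	}
	for _, raw := range candidates {
		if err := validateURL(raw); err != nil {
			fmt.Printf("%s: %v\n", raw, err)
			continue
		}
		fmt.Printf("%s: ok\n", raw)
	}
}
```

      A check like this can be run against the api_url value before it is placed in the secret, so that a malformed value is caught before the operator tries to sync it.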

      Environment:

      User-workload-monitoring, Alertmanager configuration.

      Important Notes:
      This issue is present in Prometheus Operator versions prior to 0.80.0, for which a fix was implemented: , which is now present in OCP 4.19.
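
      The report does not describe how the fix works. As a rough illustration of the failure mode in the title, the Go sketch below contrasts a build step that aborts on the first invalid object with one that skips invalid objects and keeps the rest. It assumes, without confirmation from this report, that isolating the bad object is the desired behavior. All type and function names are hypothetical and the validation is deliberately simplified.

```go
package main

import (
	"fmt"
	"strings"
)

// receiverConfig is a hypothetical stand-in for one AlertmanagerConfig object.
type receiverConfig struct {
	Name   string
	APIURL string
}

// validate is a simplified check: only http(s) URLs are accepted.
func validate(c receiverConfig) error {
	if strings.HasPrefix(c.APIURL, "http://") || strings.HasPrefix(c.APIURL, "https://") {
		return nil
	}
	return fmt.Errorf("AlertmanagerConfig %s: invalid URL %q in key \"api_url\"", c.Name, c.APIURL)
}

// buildFailFast aborts on the first invalid object, so nothing is applied.
func buildFailFast(cfgs []receiverConfig) ([]string, error) {
	var applied []string
	for _, c := range cfgs {
		if err := validate(c); err != nil {
			return nil, fmt.Errorf("failed to generate Alertmanager configuration: %w", err)
		}
		applied = append(applied, c.Name)
	}
	return applied, nil
}

// buildSkipInvalid keeps the valid objects and collects an error for each
// rejected one, so a single bad object does not block the others.
func buildSkipInvalid(cfgs []receiverConfig) (applied []string, rejected []error) {
	for _, c := range cfgs {
		if err := validate(c); err != nil {
			rejected = append(rejected, err)
			continue
		}
		applied = append(applied, c.Name)
	}
	return applied, rejected
}

func main() {
	// Example objects are made up; the second one has no URL scheme.
	cfgs := []receiverConfig{
		{Name: "team-a/slack", APIURL: "https://hooks.example.com/a"},
		{Name: "team-b/slack", APIURL: "hooks.example.com/b"},
		{Name: "team-c/slack", APIURL: "https://hooks.example.com/c"},
	}

	applied, err := buildFailFast(cfgs)
	fmt.Println("fail-fast applied:", applied, "error:", err)

	applied, rejected := buildSkipInvalid(cfgs)
	fmt.Println("skip-invalid applied:", applied, "rejected:", len(rejected))
}
```

      In the fail-fast variant the valid objects for team-a and team-c are never applied, which matches the symptom reported here.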

      Impact

      This issue affects customers running OCP clusters older than 4.19, especially those on Extended Update Support, who remain affected for the duration of their support period unless they upgrade to a 4.19 cluster.

      Version-Release number of selected component (if applicable):

          4.16.z

      How reproducible:

          Easily reproducible on a 4.16 cluster.

      This is a request to backport the fix from Prometheus Operator 0.80.0 to earlier OCP versions.

              janantha@redhat.com Jayapriya Pai
              rhn-support-ccostell Cormac Costello
              None
              None
              Junqi Zhao
              None