Type: Bug
Resolution: Unresolved
Priority: Normal
Fix Version/s: None
Affects Version/s: 4.22.0
Component/s: Monitoring
Labels:
None

Activity Type:
None
Blocked:
False
Blocked Reason:

None
Story Points:
None
Severity:
Low
Regression:
Yes

Target Backport Versions:
None
Target Version:

4.22.0
Release Blocker:
None
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

Enable user workload monitoring (UWM):

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true

Enable the UWM Alertmanager and enableAlertmanagerConfig:

apiVersion: v1
kind: ConfigMap
data:
  config.yaml: |
    alertmanager:
      enabled: true
      enableAlertmanagerConfig: true
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring

Create a custom AlertmanagerConfig that references the secret my-workflow-webhook, which does not exist:

$ oc new-project noodles;
$ oc create -f - << eof
apiVersion: monitoring.coreos.com/v1beta1
kind: AlertmanagerConfig
metadata:
  name: example
  namespace: noodles
spec:
  route:
    groupBy:
    - namespace
    receiver: msteams
  receivers:
  - name: msteams
    msteamsConfigs:
    - webhookUrl:
        key: url
        name: my-workflow-webhook # k8s secret name in same namespace as AlertManagerConfig
      sendResolved: true
      title: "mytitle"
      text: "mytext"
eof

Checked with 4.21.0-0.nightly-2026-02-02-085603 and 4.22.0-0.nightly-2026-01-26-181726, neither of which includes the fix for https://issues.redhat.com/browse/OCPBUGS-67303 (PR: https://github.com/openshift/prometheus-operator/pull/358), and compared with 4.22.0-0.nightly-2026-02-02-081748, which includes the fix from https://github.com/openshift/prometheus-operator/pull/358.

On 4.21.0-0.nightly-2026-02-02-085603 and 4.22.0-0.nightly-2026-01-26-181726, monitoring is degraded with "unable to get secret \"my-workflow-webhook\": secrets \"my-workflow-webhook\" not found". On 4.22.0-0.nightly-2026-02-02-081748, monitoring is not degraded. As tested in https://issues.redhat.com/browse/OCPBUGS-67303?focusedId=28903702&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-28903702, an upgrade to another version is then blocked by the missing secret, which may be too late a point to notify the customer. For CI upgrade jobs, the job is marked as failed and the owner has to analyze it.
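
The degraded / not-degraded comparison above can be checked mechanically. The following is a minimal sketch, not part of the original report: co_status is a hypothetical helper name, and it assumes the default table layout printed by oc get co (NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED, ...).

```shell
#!/bin/sh
# Hypothetical helper (not from the report): reads `oc get co <name>` table
# output on stdin and prints the AVAILABLE, PROGRESSING and DEGRADED columns
# of each operator row, skipping the header line.
co_status() {
  awk 'NR > 1 && NF >= 5 { print $3, $4, $5 }'
}

# Usage against a live cluster (requires oc and cluster access):
#   oc get co monitoring | co_status
# The Degraded condition alone can also be read with a jsonpath query:
#   oc get co monitoring -o jsonpath='{.status.conditions[?(@.type=="Degraded")].status}'
```

Fed the three oc get co monitoring outputs recorded in this report, the helper prints "False True True" for the two builds without the fix and "True False False" for the build with the fix.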

4.21.0-0.nightly-2026-02-02-085603

$ oc -n openshift-user-workload-monitoring get cm user-workload-monitoring-config -oyaml
apiVersion: v1
data:
  config.yaml: |
    alertmanager:
      enabled: true
      enableAlertmanagerConfig: true
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"config.yaml":"alertmanager:\n  enabled: true\n  enableAlertmanagerConfig: true\n"},"kind":"ConfigMap","metadata":{"annotations":{},"name":"user-workload-monitoring-config","namespace":"openshift-user-workload-monitoring"}}
  creationTimestamp: "2026-02-03T07:58:22Z"
  labels:
    app.kubernetes.io/managed-by: cluster-monitoring-operator
    app.kubernetes.io/part-of: openshift-monitoring
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
  resourceVersion: "40811"
  uid: 2c83db6b-79d6-4572-9d05-2a61322c1668

$ date -u;oc -n openshift-user-workload-monitoring logs deploy/prometheus-operator | grep my-workflow-webhook | tail -n1
Tue Feb  3 08:36:55 UTC 2026
ts=2026-02-03T08:36:35.049504365Z level=error caller=/go/src/github.com/coreos/prometheus-operator/pkg/operator/resource_reconciler.go:678 msg="Unhandled Error" logger=UnhandledError err="sync \"openshift-user-workload-monitoring/user-workload\" failed: provision alertmanager configuration: failed to generate Alertmanager configuration: AlertmanagerConfig noodles/example: MSTeamsConfig[0]: unable to get secret \"my-workflow-webhook\": secrets \"my-workflow-webhook\" not found"

$ oc get co monitoring
NAME         VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
monitoring   4.21.0-0.nightly-2026-02-02-085603   False       True          True       10m     UpdatingUserWorkloadAlertmanager: waiting for Alertmanager User Workload object changes failed: waiting for Alertmanager openshift-user-workload-monitoring/user-workload: context deadline exceeded: condition Reconciled: status False: reason ReconciliationFailed: provision alertmanager configuration: failed to generate Alertmanager configuration: AlertmanagerConfig noodles/example: MSTeamsConfig[0]: unable to get secret "my-workflow-webhook": secrets "my-workflow-webhook" not found

4.22.0-0.nightly-2026-01-26-181726

$ oc -n openshift-user-workload-monitoring get cm user-workload-monitoring-config -oyaml
apiVersion: v1
data:
  config.yaml: |
    alertmanager:
      enabled: true
      enableAlertmanagerConfig: true
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"config.yaml":"alertmanager:\n  enabled: true\n  enableAlertmanagerConfig: true\n"},"kind":"ConfigMap","metadata":{"annotations":{},"name":"user-workload-monitoring-config","namespace":"openshift-user-workload-monitoring"}}
  creationTimestamp: "2026-02-03T07:59:11Z"
  labels:
    app.kubernetes.io/managed-by: cluster-monitoring-operator
    app.kubernetes.io/part-of: openshift-monitoring
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
  resourceVersion: "94689"
  uid: bd3e17c6-2e8e-448a-b8e8-b170b8837a20

$ date -u;oc -n openshift-user-workload-monitoring logs deploy/prometheus-operator | grep my-workflow-webhook | tail -n1
Tue Feb  3 08:37:01 AM UTC 2026
ts=2026-02-03T08:36:45.021356744Z level=error caller=/go/src/github.com/coreos/prometheus-operator/pkg/operator/resource_reconciler.go:678 msg="Unhandled Error" logger=UnhandledError err="sync \"openshift-user-workload-monitoring/user-workload\" failed: provision alertmanager configuration: failed to generate Alertmanager configuration: AlertmanagerConfig noodles/example: MSTeamsConfig[0]: unable to get secret \"my-workflow-webhook\": secrets \"my-workflow-webhook\" not found"

$ oc get co monitoring
NAME         VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
monitoring   4.22.0-0.nightly-2026-01-26-181726   False       True          True       11m     UpdatingUserWorkloadAlertmanager: waiting for Alertmanager User Workload object changes failed: waiting for Alertmanager openshift-user-workload-monitoring/user-workload: context deadline exceeded: condition Reconciled: status False: reason ReconciliationFailed: provision alertmanager configuration: failed to generate Alertmanager configuration: AlertmanagerConfig noodles/example: MSTeamsConfig[0]: unable to get secret "my-workflow-webhook": secrets "my-workflow-webhook" not found

4.22.0-0.nightly-2026-02-02-081748

$ oc -n openshift-user-workload-monitoring get cm user-workload-monitoring-config -oyaml
apiVersion: v1
data:
  config.yaml: |
    alertmanager:
      enabled: true
      enableAlertmanagerConfig: true
kind: ConfigMap
metadata:
  creationTimestamp: "2026-02-03T07:36:19Z"
  labels:
    app.kubernetes.io/managed-by: cluster-monitoring-operator
    app.kubernetes.io/part-of: openshift-monitoring
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
  resourceVersion: "36710"
  uid: c65ab372-e759-46fd-84e9-f297de979ab0

$ date -u;oc -n openshift-user-workload-monitoring logs deploy/prometheus-operator | grep my-workflow-webhook | tail -n1
Tue Feb  3 08:37:04 AM UTC 2026
ts=2026-02-03T08:23:08.020926765Z level=info caller=/go/src/github.com/coreos/prometheus-operator/vendor/k8s.io/client-go/tools/events/event_broadcaster.go:338 msg="Event occurred" object.name=example object.namespace=noodles kind=AlertmanagerConfig apiVersion=monitoring.coreos.com/v1alpha1 type=Warning reason=InvalidConfiguration action=SelectingAlertmanagerConfigResources note="AlertmanagerConfig example was rejected due to invalid configuration: unable to get secret \"my-workflow-webhook\": secrets \"my-workflow-webhook\" not found"

$ oc get co monitoring
NAME         VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
monitoring   4.22.0-0.nightly-2026-02-02-081748   True        False         False      149m
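
The prometheus-operator log lines captured above differ between builds: without the fix, the missing secret surfaces as a level=error sync failure; with the fix, it surfaces as an InvalidConfiguration warning event and the AlertmanagerConfig is rejected. A minimal sketch for telling the two apart, assuming the log formats shown in this report (classify_log_line and its output labels are hypothetical):

```shell
#!/bin/sh
# Hypothetical helper (not from the report): classifies one
# prometheus-operator log line by how the missing secret is reported.
classify_log_line() {
  case "$1" in
    # Build with the fix: the AlertmanagerConfig is rejected via a Warning event.
    *"reason=InvalidConfiguration"*) echo "rejected" ;;
    # Builds without the fix: the whole Alertmanager sync fails.
    *"level=error"*"failed to generate Alertmanager configuration"*) echo "sync-failed" ;;
    *) echo "other" ;;
  esac
}

# Usage against a live cluster (requires oc and cluster access):
#   line=$(oc -n openshift-user-workload-monitoring logs deploy/prometheus-operator \
#            | grep my-workflow-webhook | tail -n1)
#   classify_log_line "$line"
```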

Version-Release number of selected component (if applicable):

4.22 payload with fix https://github.com/openshift/prometheus-operator/pull/358

How reproducible:

always

Steps to Reproduce:

1. See the description above.

Actual results:

With a 4.22 payload that includes the fix https://github.com/openshift/prometheus-operator/pull/358, monitoring is not degraded.

Expected results:

With a 4.22 payload that includes the fix https://github.com/openshift/prometheus-operator/pull/358, monitoring is degraded.

Additional info:

Assignee: Jayapriya Pai

Reporter: Junqi Zhao

Need Info From: None

Contributors: None

QA Contact: Junqi Zhao

Doc Contact: None

Votes: 0

Watchers: 2

Created: 2026/02/03 8:56 AM

Updated: 2026/02/11 3:12 PM
