RFE-8726

Enable Namespace-Scoped Alerting and Visibility for User Workload Monitoring Metrics

    • Type: Feature Request
    • Resolution: Unresolved
    • Priority: Normal
    • Monitoring, User Interface
    • Product / Portfolio Work

      Introduce a supported, namespace-scoped alerting and visibility model for User Workload Monitoring that allows:

      1. Namespace-level alert rules
        • Users can define PrometheusRules scoped to their namespace
        • Rules evaluate only metrics relevant to that namespace
      2. Namespace-level alert visibility
        • Users can see:
          • Alert firing state
          • Alert history
          • Alert labels and annotations
        • Without access to cluster-wide monitoring data
      3. Web Console support
        • Enable Observe → Metrics / Alerts UI for namespace-scoped users
        • Limited to User Workload Monitoring data only
      4. Least-privilege RBAC
        • New or enhanced roles enabling:
          • Alert creation
          • Alert viewing
          • Alert querying
        • Without requiring cluster-monitoring-view
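
To make the "alert creation" part of the RBAC item concrete, the sketch below builds a namespaced Kubernetes Role that lets a team manage PrometheusRule objects in its own namespace. It is only an illustration under stated assumptions: the role name `alert-rule-editor`, the namespace `team-a`, and the verb list are hypothetical, and the sketch does not cover alert viewing or querying, which are the capabilities this request asks to have added.

```python
import json


def build_rule_editor_role(namespace: str) -> dict:
    """Return a Role manifest scoped to a single namespace.

    The role name and verb list are illustrative assumptions,
    not a role that ships with the product.
    """
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": "alert-rule-editor", "namespace": namespace},
        "rules": [
            {
                # PrometheusRule objects live in this API group.
                "apiGroups": ["monitoring.coreos.com"],
                "resources": ["prometheusrules"],
                "verbs": ["get", "list", "watch", "create", "update", "patch", "delete"],
            }
        ],
    }


if __name__ == "__main__":
    # "team-a" is a hypothetical namespace used only for this example.
    print(json.dumps(build_rule_editor_role("team-a"), indent=2))
```

Because the object is a Role rather than a ClusterRole, binding it grants nothing outside the named namespace, which is the least-privilege property the request is after.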

      Current Behavior

      • cert-manager exposes certificate expiration metrics (for example:
        certmanager_certificate_expiration_timestamp_seconds)
      • These metrics can be scraped by User Workload Monitoring
      • However:
        • Namespace-scoped users cannot view alert firing state
        • Namespace-scoped users cannot access the Observe → Metrics / Alerts UI
        • Alert evaluation and visibility remain effectively cluster-scoped
      • Granting cluster-monitoring-view exposes all cluster metrics, which violates least-privilege requirements
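
As an illustration of the kind of rule a team would want to own, the sketch below builds a PrometheusRule manifest that fires when a certificate is close to expiry, using the metric named above. The namespace `team-a`, the rule and alert names, the 14-day threshold, and the `for` duration are all hypothetical. Label matchers are deliberately left out of the expression, since which labels identify a certificate's namespace depends on how the metric is scraped.

```python
import json

# Metric name taken from the request text.
METRIC = "certmanager_certificate_expiration_timestamp_seconds"


def build_cert_expiry_rule(namespace: str, warn_days: int = 14) -> dict:
    """Return a PrometheusRule manifest for a certificate-expiry alert.

    All names and thresholds are illustrative assumptions.
    """
    threshold_seconds = warn_days * 24 * 60 * 60
    # The metric is a Unix timestamp, so subtracting time() gives seconds remaining.
    expr = f"{METRIC} - time() < {threshold_seconds}"
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {"name": "cert-expiry", "namespace": namespace},
        "spec": {
            "groups": [
                {
                    "name": "cert-expiry.rules",
                    "rules": [
                        {
                            "alert": "CertificateExpiringSoon",
                            "expr": expr,
                            "for": "1h",
                            "labels": {"severity": "warning"},
                            "annotations": {
                                "summary": f"A certificate expires in less than {warn_days} days"
                            },
                        }
                    ],
                }
            ]
        },
    }


if __name__ == "__main__":
    # "team-a" is a hypothetical namespace used only for this example.
    print(json.dumps(build_cert_expiry_rule("team-a"), indent=2))
```

The printed JSON can be saved to a file and applied like any other manifest. The gap described in this section is what happens afterwards: the namespace-scoped owner of the rule has no supported way to see whether it is firing.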

      Business Impact

      • Security risk: Delayed certificate expiration awareness
      • Operational overhead: Cluster admins must manage alerts for all teams
      • Scalability limitation: Centralized alerting does not scale in multi-tenant clusters
      • Least-privilege violation: Teams must be granted cluster-wide monitoring access to perform basic observability tasks

      Why Existing Workarounds Are Insufficient

      • Centralized alert routing removes self-service ownership
      • External Grafana requires additional infrastructure and manual access control
      • cluster-monitoring-view exposes sensitive cluster metrics
      • CLI / Prometheus API access lacks UI visibility and alert state awareness

      Expected Benefits

      • Improved multi-tenancy support
      • Stronger security posture (least privilege)
      • Reduced operational burden on platform teams
      • Better alignment with Kubernetes namespace ownership model
      • Improved customer adoption of cert-manager and UWM

              rh-ee-rfloren Roger Florén
              rhn-support-skohli Shubh Kohli