Uploaded image for project: 'OCP Technical Release Team'
  1. OCP Technical Release Team
  2. TRT-2153

Rosa Alert Failures for: PrometheusRemoteStorageFailures

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Done
    • Icon: Blocker Blocker
    • None
    • None
    • Incidents & Support
    • False
    • Hide

      None

      Show
      None
    • None
    • None
    • None
    • None
    • None
    • None
    • None
    • None

      4.19.0-0.nightly-2025-06-10-072320 and 4.20.0-0.nightly-2025-06-10-072846 started failing due to

            {
              "metric": {
                "__name__": "ALERTS",
                "alertname": "PrometheusRemoteStorageFailures",
                "alertstate": "firing",
                "container": "kube-rbac-proxy",
                "endpoint": "metrics",
                "instance": "10.129.2.9:9092",
                "job": "prometheus-k8s",
                "namespace": "openshift-monitoring",
                "pod": "prometheus-k8s-0",
                "prometheus": "openshift-monitoring/k8s",
                "remote_name": "4f611c",
                "service": "prometheus-k8s",
                "severity": "warning",
                "url": "https://observatorium-mst.api.stage.openshift.com/api/metrics/v1/osd/api/v1/receive"
              },
      

      Slack thread with hcm team kicking off the investigation

              rh-ee-fbabcock Forrest Babcock
              rh-ee-fbabcock Forrest Babcock
              None
              None
              None
              None
              None
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

                Created:
                Updated:
                Resolved: