OpenShift Bugs / OCPBUGS-49980

Two "sum:apiserver_request:burnrate5m" recording rules in 4.19


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • None
    • 4.19.0
    • kube-apiserver
    • Moderate
    • Yes
    • False
    • Release Note Not Required

      Description of problem:

      $ oc get clusterversion
      NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.19.0-0.nightly-2025-02-07-024732   True        False         133m    Cluster version is 4.19.0-0.nightly-2025-02-07-024732
      
      $ oc -n openshift-monitoring logs -c prometheus prometheus-k8s-0 | grep "Error on ingesting results from rule evaluation with different value but same timestamp" 
      ts=2025-02-07T06:50:32.371Z caller=group.go:599 level=warn name=sum:apiserver_request:burnrate5m index=11 component="rule manager" file=/etc/prometheus/rules/prometheus-k8s-rulefiles-0/openshift-kube-apiserver-kube-apiserver-slos-basic-4c40cd93-505e-4e93-a53c-fdbb47f77d9d.yaml group=kube-apiserver.rules msg="Error on ingesting results from rule evaluation with different value but same timestamp" num_dropped=1
      ts=2025-02-07T06:51:02.376Z caller=group.go:599 level=warn name=sum:apiserver_request:burnrate5m index=11 component="rule manager" file=/etc/prometheus/rules/prometheus-k8s-rulefiles-0/openshift-kube-apiserver-kube-apiserver-slos-basic-4c40cd93-505e-4e93-a53c-fdbb47f77d9d.yaml group=kube-apiserver.rules msg="Error on ingesting results from rule evaluation with different value but same timestamp" num_dropped=1
      ....    
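To see which rules trigger this warning and how often, the log can be summarized per rule name. The following is a minimal sketch, not part of the original report: the function name `count_dropped_by_rule` is hypothetical, and it assumes the logfmt layout shown in the lines above (a `name=` field followed by a space).

```shell
# Count "different value but same timestamp" warnings per rule name.
# Reads Prometheus log lines on stdin; prints "<count> <rule name>", highest first.
# Note: this counts warning lines, it does not sum the num_dropped values.
count_dropped_by_rule() {
  grep 'different value but same timestamp' \
    | sed -n 's/.* name=\([^ ]*\) .*/\1/p' \
    | awk '{ n[$0]++ } END { for (r in n) print n[r], r }' \
    | sort -rn
}
```

Possible usage: `oc -n openshift-monitoring logs -c prometheus prometheus-k8s-0 | count_dropped_by_rule`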

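      Checking the rules shows two recording rules named "sum:apiserver_request:burnrate5m" in 4.19. The second one, whose expression uses the 6h window, should be recorded as "sum:apiserver_request:burnrate6h", not "sum:apiserver_request:burnrate5m". Because both rules write to the same series, an evaluation yields two different values for one timestamp and Prometheus drops one of them (num_dropped=1 in the warnings above).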

      $ oc -n openshift-kube-apiserver get prometheusrules kube-apiserver-slos-basic -oyaml
      ...
          - expr: |
              sum(apiserver_request:burn5m)
              /
              sum by (cluster) (rate(apiserver_request_total{job="apiserver",verb=~"LIST|GET|POST|PUT|PATCH|DELETE"}[5m]))
            record: sum:apiserver_request:burnrate5m
      ...
          - expr: |
              sum(apiserver_request:burn6h)
              /
              sum by (cluster) (rate(apiserver_request_total{job="apiserver",verb=~"LIST|GET|POST|PUT|PATCH|DELETE"}[6h]))
            record: sum:apiserver_request:burnrate5m

      The duplicated rule name is defined in https://github.com/openshift/cluster-kube-apiserver-operator/blob/release-4.19/bindata/assets/alerts/kube-apiserver-slos-basic.yaml#L213-L217
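A quick way to spot this kind of duplication is to list every `record:` name that occurs more than once in the rendered PrometheusRule. This is a rough sketch under stated assumptions: the helper name `find_duplicate_records` is hypothetical, it does plain text matching rather than YAML parsing, and it would also flag rules that legitimately share a name but differ in static labels.

```shell
# Print each recording-rule name that appears more than once in the
# PrometheusRule YAML read from stdin.
# Matches lines of the form "record: <name>" or "- record: <name>".
find_duplicate_records() {
  sed -n 's/^[[:space:]-]*record:[[:space:]]*//p' | sort | uniq -d
}
```

Possible usage: `oc -n openshift-kube-apiserver get prometheusrules kube-apiserver-slos-basic -oyaml | find_duplicate_records`. On an affected cluster this would be expected to print the duplicated name; empty output means no name is repeated.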

      Version-Release number of selected component (if applicable):

      4.19+    

      How reproducible:

      always

      Steps to Reproduce:

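      1. See the description above: check the prometheus container logs and inspect the kube-apiserver-slos-basic PrometheusRule.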

      Actual results:

      There are two "sum:apiserver_request:burnrate5m" recording rules in 4.19.

      Expected results:

      There is only one "sum:apiserver_request:burnrate5m" recording rule in 4.19; the rule with the 6h expression is recorded as "sum:apiserver_request:burnrate6h".

      Additional info:

      The issue only occurs in 4.19.

              juzhao@redhat.com Junqi Zhao
              juzhao@redhat.com Junqi Zhao
              Ke Wang Ke Wang