Type: Bug
Resolution: Done
Priority: Undefined
Fix Version/s: None
Affects Version/s: 4.11.z
Component/s: kube-apiserver
Labels:
- api

Regression:
None
Blocked:
False
Blocked Reason:

None
Blocked by Bugzilla Bug:
https://bugzilla.redhat.com/show_bug.cgi?id=2094362
Target Version:

4.11.z

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

This bug is a backport clone of [Bugzilla Bug 2094362](https://bugzilla.redhat.com/show_bug.cgi?id=2094362). The following is the description of the original bug:
—
Description of problem:
A change [1] was introduced to split the kube-apiserver SLO rules into 2 groups to reduce the load on Prometheus (see bug 2004585).

[1] https://github.com/openshift/cluster-kube-apiserver-operator/commit/4a1751ee86cda37f0d9ea520cac09f91ebc3abe9

Version-Release number of selected component (if applicable):
4.9 (because the change was backported to 4.9.z)

How reproducible:
Always

Steps to Reproduce:
1. Install OCP 4.9
2. Retrieve the kube-apiserver-slos* PrometheusRule objects:
oc get -n openshift-kube-apiserver prometheusrules kube-apiserver-slos -o yaml
oc get -n openshift-kube-apiserver prometheusrules kube-apiserver-slos-basic -o yaml

Actual results:

The KubeAPIErrorBudgetBurn alert with labels

{long="1h",namespace="openshift-kube-apiserver",severity="critical",short="5m"}

exists both in kube-apiserver-slos and kube-apiserver-slos-basic.

The alerting rule is therefore evaluated twice. The same is true for recording rules such as "apiserver_request:burnrate1h", where the duplicate evaluation can trigger warning logs in the Prometheus pods:

> level=warn component="rule manager" group=kube-apiserver.rules msg="Error on ingesting out-of-order result from rule evaluation" numDropped=283
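
To check a cluster for this kind of overlap, the comparison described above can be sketched as a small script. This is only an illustration, not part of the operator: the function names `rule_key` and `find_duplicates` are made up here, the `expr` strings are placeholders, and the labels on the recording rule and on the `kube-apiserver-slos-extended` alert are assumptions rather than values taken from the real manifests. It treats two rules as duplicates when they share the same kind (alert or record), name, and static labels.

```python
from collections import defaultdict


def rule_key(rule):
    """Identify a rule by kind, name, and its static labels."""
    kind = "alert" if "alert" in rule else "record"
    name = rule.get("alert") or rule.get("record")
    labels = tuple(sorted(rule.get("labels", {}).items()))
    return (kind, name, labels)


def find_duplicates(prometheus_rules):
    """Map each rule key to the PrometheusRule objects defining it,
    keeping only rules that appear in more than one object."""
    seen = defaultdict(set)
    for obj in prometheus_rules:
        obj_name = obj["metadata"]["name"]
        for group in obj.get("spec", {}).get("groups", []):
            for rule in group.get("rules", []):
                seen[rule_key(rule)].add(obj_name)
    return {key: sorted(names) for key, names in seen.items() if len(names) > 1}


# Labels of the critical alert, as quoted in this report.
critical_labels = {
    "long": "1h",
    "namespace": "openshift-kube-apiserver",
    "severity": "critical",
    "short": "5m",
}


def make_object(name, rules):
    """Build a minimal PrometheusRule-shaped dict (sample data only)."""
    return {
        "metadata": {"name": name},
        "spec": {"groups": [{"name": "kube-apiserver.rules", "rules": rules}]},
    }


# Sample data; expr values are placeholders, not the real expressions.
shared_rules = [
    {"alert": "KubeAPIErrorBudgetBurn", "expr": "...", "labels": critical_labels},
    # The "verb" label is an assumption for illustration.
    {"record": "apiserver_request:burnrate1h", "expr": "...", "labels": {"verb": "read"}},
]
slos = make_object("kube-apiserver-slos", shared_rules)
slos_basic = make_object("kube-apiserver-slos-basic", shared_rules)
# Hypothetical labels for a rule that exists only in the extended object.
slos_extended = make_object(
    "kube-apiserver-slos-extended",
    [{"alert": "KubeAPIErrorBudgetBurn", "expr": "...",
      "labels": {"long": "1d", "severity": "warning", "short": "2h"}}],
)

duplicates = find_duplicates([slos, slos_basic, slos_extended])
for (kind, name, labels), objects in duplicates.items():
    print(kind, name, dict(labels), "defined in", objects)
```

Against a live cluster, the input could come from `oc get -n openshift-kube-apiserver prometheusrules -o json`, passing the `items` array of the returned list to `find_duplicates`.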

Expected results:

I presume that kube-apiserver-slos shouldn't exist since it's been replaced by kube-apiserver-slos-basic and kube-apiserver-slos-extended.

Additional info:

Discovered while investigating bug 2091902

links to

openshift/cluster-kube-apiserver-operator#1397: [release-4.11] OCPBUGS-2938: Duplicate prometheus rules for API SLOs after upgrade

Assignee: Luis Sanchez

Reporter: OpenShift Prow Bot

QA Contact: Deepak Punia (Inactive)

Votes: 0

Watchers: 5

Created: 2022/10/28 2:33 AM

Updated: 2023/02/07 1:22 PM

Resolved: 2023/02/07 1:22 PM
