- Bug
- Resolution: Done
- 1.14.0
On a cluster with ~50 namespaces containing a ksvc, `webhook_request_latencies_bucket{job="webhook-sm-service"}` is the single largest metric on the cluster, with 29801 time series (which is more than the API servers').
Examples:

webhook_request_latencies_bucket{admission_allowed="false", endpoint="https", instance="10.128.2.33:8444", job="webhook-sm-service", kind_group="networking.internal.knative.dev", kind_kind="Ingress", kind_version="v1alpha1", le="+Inf", namespace="knative-serving", pod="webhook-576b57b4d6-glxhp", request_operation="UPDATE", resource_group="networking.internal.knative.dev", resource_namespace="helloworld-re-sn-0", resource_resource="ingresses", resource_version="v1alpha1", service="webhook-sm-service"} 436
webhook_request_latencies_bucket{admission_allowed="false", endpoint="https", instance="10.128.2.33:8444", job="webhook-sm-service", kind_group="networking.internal.knative.dev", kind_kind="Ingress", kind_version="v1alpha1", le="+Inf", namespace="knative-serving", pod="webhook-576b57b4d6-glxhp", request_operation="UPDATE", resource_group="networking.internal.knative.dev", resource_namespace="helloworld-re-sn-1", resource_resource="ingresses", resource_version="v1alpha1", service="webhook-sm-service"} 424
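As a quick sanity check of the total (just a sketch, assuming the cluster's Prometheus is reachable and the job label is as shown above):

```
# total number of webhook_request_latencies_bucket series (~29801 reported above)
count(webhook_request_latencies_bucket{job="webhook-sm-service"})
```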
It seems there is a time series for each
`bucket * namespace * kind * operation`
combination, for any namespace with a ksvc and for all the Knative kinds...
(There are 17 buckets, and (kind * operation) seems to be around 32, so there are about 544 time series per namespace.)
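The individual factors of that estimate can be checked the same way (again only a sketch; the label names are taken from the samples above):

```
# number of distinct buckets (expected: 17)
count(count(webhook_request_latencies_bucket{job="webhook-sm-service"}) by (le))

# number of distinct kind/operation combinations (estimated ~32 above)
count(count(webhook_request_latencies_bucket{job="webhook-sm-service"}) by (kind_kind, request_operation))

# series per namespace (estimated ~544 above)
count(webhook_request_latencies_bucket{job="webhook-sm-service"}) by (resource_namespace)
```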
Notice the `resource_namespace` label, which means there are bucket time series for each namespace... (Also note that even the OpenShift `apiserver_request_duration_seconds_bucket` metric doesn't distinguish namespaces, so IMHO it seems excessive for `webhook_request_latencies_bucket` to do so.)
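For context, a typical latency query over this histogram aggregates everything except `le` away anyway (an illustrative query, not taken from an existing dashboard):

```
# p99 webhook admission latency, ignoring the per-namespace breakdown
histogram_quantile(0.99, sum(rate(webhook_request_latencies_bucket{job="webhook-sm-service"}[5m])) by (le))
```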
See also the discussion https://coreos.slack.com/archives/CD87JDUB0/p1619024836262000