Bug
Resolution: Done-Errata
Major
premerge
None
Moderate
No
MON Sprint 235
1
Rejected
False
Bug Fix
Done
Description of problem:
Tested https://issues.redhat.com/browse/OCPBUGS-10387 on a cluster launched with the PR:
launch 4.14-ci,openshift/cluster-monitoring-operator#1926 no-spot
The cluster has 3 masters and 3 workers, each node with 4 CPUs, and no infra node.
$ oc get node
NAME                                         STATUS   ROLES                  AGE   VERSION
ip-10-0-132-193.us-east-2.compute.internal   Ready    control-plane,master   23m   v1.26.2+d2e245f
ip-10-0-135-65.us-east-2.compute.internal    Ready    control-plane,master   23m   v1.26.2+d2e245f
ip-10-0-149-72.us-east-2.compute.internal    Ready    worker                 14m   v1.26.2+d2e245f
ip-10-0-158-0.us-east-2.compute.internal     Ready    worker                 14m   v1.26.2+d2e245f
ip-10-0-229-135.us-east-2.compute.internal   Ready    worker                 17m   v1.26.2+d2e245f
ip-10-0-234-36.us-east-2.compute.internal    Ready    control-plane,master   23m   v1.26.2+d2e245f
The node role labels are:
control-plane: node-role.kubernetes.io/control-plane: ""
master: node-role.kubernetes.io/master: ""
worker: node-role.kubernetes.io/worker: ""
Searching for "cluster:capacity_cpu_cores:sum" in the admin console under "Observe -> Metrics" returns the series with label_node_role_kubernetes_io=master and the series with label_node_role_kubernetes_io="" twice each:
Name                            label_beta_kubernetes_io_instance_type  label_kubernetes_io_arch  label_node_openshift_io_os_id  label_node_role_kubernetes_io  prometheus                Value
cluster:capacity_cpu_cores:sum  m6a.xlarge                              amd64                     rhcos                                                         openshift-monitoring/k8s  12
cluster:capacity_cpu_cores:sum  m6a.xlarge                              amd64                     rhcos                          master                         openshift-monitoring/k8s  12
cluster:capacity_cpu_cores:sum  m6a.xlarge                              amd64                     rhcos                                                         openshift-monitoring/k8s  12
cluster:capacity_cpu_cores:sum  m6a.xlarge                              amd64                     rhcos                          master                         openshift-monitoring/k8s  12
Checked through the thanos-querier API: the result is the same as in the console UI (the console UI uses the thanos-querier API).
$ token=`oc create token prometheus-k8s -n openshift-monitoring`
$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-querier.openshift-monitoring.svc:9091/api/v1/query?' --data-urlencode 'query=cluster:capacity_cpu_cores:sum' | jq
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "cluster:capacity_cpu_cores:sum",
          "label_beta_kubernetes_io_instance_type": "m6a.xlarge",
          "label_kubernetes_io_arch": "amd64",
          "label_node_openshift_io_os_id": "rhcos",
          "prometheus": "openshift-monitoring/k8s"
        },
        "value": [
          1682394655.248,
          "12"
        ]
      },
      {
        "metric": {
          "__name__": "cluster:capacity_cpu_cores:sum",
          "label_beta_kubernetes_io_instance_type": "m6a.xlarge",
          "label_kubernetes_io_arch": "amd64",
          "label_node_openshift_io_os_id": "rhcos",
          "label_node_role_kubernetes_io": "master",
          "prometheus": "openshift-monitoring/k8s"
        },
        "value": [
          1682394655.248,
          "12"
        ]
      },
      {
        "metric": {
          "__name__": "cluster:capacity_cpu_cores:sum",
          "label_beta_kubernetes_io_instance_type": "m6a.xlarge",
          "label_kubernetes_io_arch": "amd64",
          "label_node_openshift_io_os_id": "rhcos",
          "prometheus": "openshift-monitoring/k8s"
        },
        "value": [
          1682394655.248,
          "12"
        ]
      },
      {
        "metric": {
          "__name__": "cluster:capacity_cpu_cores:sum",
          "label_beta_kubernetes_io_instance_type": "m6a.xlarge",
          "label_kubernetes_io_arch": "amd64",
          "label_node_openshift_io_os_id": "rhcos",
          "label_node_role_kubernetes_io": "master",
          "prometheus": "openshift-monitoring/k8s"
        },
        "value": [
          1682394655.248,
          "12"
        ]
      }
    ]
  }
}
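To make the duplication easy to spot in a response like the one above, the series can be grouped by their full label set. The following is only a minimal sketch: the helper names `series` and `find_duplicate_series` are hypothetical, and the sample data is a hand-built copy of the four series returned by thanos-querier.

```python
from collections import Counter


def series(role=None):
    """Build one sample shaped like an entry of data.result (hypothetical helper)."""
    metric = {
        "__name__": "cluster:capacity_cpu_cores:sum",
        "label_beta_kubernetes_io_instance_type": "m6a.xlarge",
        "label_kubernetes_io_arch": "amd64",
        "label_node_openshift_io_os_id": "rhcos",
        "prometheus": "openshift-monitoring/k8s",
    }
    if role is not None:
        metric["label_node_role_kubernetes_io"] = role
    return {"metric": metric, "value": [1682394655.248, "12"]}


# Hand-built copy of the thanos-querier response from this report.
RESPONSE = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [series(), series("master"), series(), series("master")],
    },
}


def find_duplicate_series(response):
    """Return {label set: count} for every label set that occurs more than once."""
    counts = Counter(
        tuple(sorted(sample["metric"].items()))
        for sample in response["data"]["result"]
    )
    return {labels: n for labels, n in counts.items() if n > 1}


if __name__ == "__main__":
    for labels, n in find_duplicate_series(RESPONSE).items():
        role = dict(labels).get("label_node_role_kubernetes_io", "")
        print(f"role={role!r} appears {n} times")
```

A real check would load the JSON printed by the curl command instead of the hand-built `RESPONSE`.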
There is no such issue when the expr behind "cluster:capacity_cpu_cores:sum" is queried directly; each series is returned only once:
Name                            label_beta_kubernetes_io_instance_type  label_kubernetes_io_arch  label_node_openshift_io_os_id  label_node_role_kubernetes_io  prometheus                Value
cluster:capacity_cpu_cores:sum  m6a.xlarge                              amd64                     rhcos                                                         openshift-monitoring/k8s  12
cluster:capacity_cpu_cores:sum  m6a.xlarge                              amd64                     rhcos                          master                         openshift-monitoring/k8s  12
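As a quick sanity check of the numbers in this report, the two series from the direct query add up to the capacity of the cluster (6 nodes with 4 CPUs each), while the four series from thanos-querier add up to twice that. The sketch below only redoes this arithmetic; the variable names are illustrative, and summing over all returned series is an assumption about how a consumer might use the metric.

```python
# Cluster layout taken from this report: 3 masters + 3 workers, 4 CPUs each.
NODES = 3 + 3
CPUS_PER_NODE = 4
expected_total = NODES * CPUS_PER_NODE

# Values of the series returned by each query path, copied from the report.
direct_values = [12, 12]          # direct expr query: 2 series
thanos_values = [12, 12, 12, 12]  # thanos-querier API: 4 series


def total_cores(values):
    """Sum the per-series CPU core counts."""
    return sum(values)


if __name__ == "__main__":
    print("expected:", expected_total)
    print("direct  :", total_cores(direct_values))
    print("thanos  :", total_cores(thanos_values))
```

With the duplicated series, anything that sums this metric over all returned series would report double the real capacity.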
The thanos-querier API should deduplicate these series.
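The expected outcome can be illustrated by collapsing series that share an identical label set. This is a sketch of the desired result only, not of how thanos-querier performs deduplication internally; `dedup_series` and `make_sample` are hypothetical names, and the label sets are shortened to the labels that matter here.

```python
def make_sample(role=None):
    """Build a shortened sample with only the labels relevant here (hypothetical helper)."""
    metric = {
        "__name__": "cluster:capacity_cpu_cores:sum",
        "prometheus": "openshift-monitoring/k8s",
    }
    if role is not None:
        metric["label_node_role_kubernetes_io"] = role
    return {"metric": metric, "value": [1682394655.248, "12"]}


def dedup_series(result):
    """Keep the first sample for each distinct label set, preserving order."""
    seen = set()
    unique = []
    for sample in result:
        key = tuple(sorted(sample["metric"].items()))
        if key not in seen:
            seen.add(key)
            unique.append(sample)
    return unique


if __name__ == "__main__":
    # Same shape as the duplicated result in this report: each series twice.
    result = [make_sample(), make_sample("master"), make_sample(), make_sample("master")]
    for sample in dedup_series(result):
        role = sample["metric"].get("label_node_role_kubernetes_io", "")
        print(f"role={role!r} value={sample['value'][1]}")
```

Applied to the four series above, this leaves one series per node role, which matches the result of the direct query.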
Version-Release number of selected component (if applicable):
4.14-ci build with openshift/cluster-monitoring-operator#1926 (testing https://issues.redhat.com/browse/OCPBUGS-10387)
How reproducible:
Always
Steps to Reproduce:
1. See the description.
Actual results:
The series for each node role is returned twice by the thanos-querier API.
Expected results:
The series for each node role should be returned only once by the thanos-querier API.