-
Bug
-
Resolution: Done-Errata
-
Critical
-
4.14.0
-
None
-
Moderate
-
No
-
MON Sprint 242, MON Sprint 243, MON Sprint 244
-
3
-
Rejected
-
False
-
-
Before the fix, Thanos Querier was unable to query pod metrics owing to a bug that caused its supporting Kube RBAC Proxy instance to disallow `metrics.k8s.io/v1beta1/pods`. This patch addresses that.
-
Bug Fix
-
In Progress
Description of problem:
The expected minimal permissions for accessing the tenancy port of the thanos-querier service in the openshift-monitoring namespace do not work; a different set of permissions works instead. In addition, GET and POST requests require different permissions.

I am trying to use the tenancy port on the thanos-querier service in the openshift-monitoring namespace. I want a pod to access these metrics, so I want to grant that pod only the minimal necessary permissions. From Slack discussions and the configuration for the thanos-querier (https://github.com/openshift/cluster-monitoring-operator/blob/release-4.11/assets/thanos-querier/kube-rbac-proxy-secret.yaml), one would expect the required permissions to be:

```
rules:
- verbs:
  - get
  apiGroups:
  - metrics.k8s.io/v1beta1
  resources:
  - pods
```

However, after binding such a role to a service account (and waiting a little for the update to propagate across the system), I get an error from inside its container:

```
sh-5.1$ curl --cacert /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" 'https://thanos-querier.openshift-monitoring.svc:9092/api/v1/query?namespace=app'
Forbidden (user=system:serviceaccount:app:default, verb=get, resource=pods, subresource=)
```

The error message suggests that the service account does not have the required permissions. Changing the role's rules, and again waiting a little for the update to propagate across the system, seems to fix this.
Note the different `apiGroups`:

```
rules:
- verbs:
  - get
  apiGroups:
  - ''
  resources:
  - pods
```

With these rules, the connection to the tenancy port succeeds:

```
sh-5.1$ curl --cacert /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" 'https://thanos-querier.openshift-monitoring.svc:9092/api/v1/query?namespace=app&query=up'
{"status":"success","data":{"resultType":"vector","result":[]}}
```

A similar issue affects POST requests to the tenancy port. One would expect the minimal required permissions to be the same for GET and POST requests, but this is not the case: GET requests require the verb `get`, whereas POST requests require the verb `create`. When using a service account bound to a Role with these rules:

```
rules:
- verbs:
  - get
  apiGroups:
  - ''
  resources:
  - pods
```

I get the following error for POST (note the `-X GET`/`-X POST` flags and the verb in the error output):

```
sh-4.4$ curl --cacert /etc/kubernetes/certs/service-ca/service-ca.crt -X GET -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" 'https://thanos-querier.openshift-monitoring.svc:9092/api/v1/query?namespace=clusters-dhurta-test-aws'
sh-4.4$
sh-4.4$ curl --cacert /etc/kubernetes/certs/service-ca/service-ca.crt -X POST -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" 'https://thanos-querier.openshift-monitoring.svc:9092/api/v1/query?namespace=clusters-dhurta-test-aws'
Forbidden (user=system:serviceaccount:clusters-dhurta-test-aws:cluster-version-operator, verb=create, resource=pods, subresource=)
```

Changing the rules to:

```
rules:
- verbs:
  - get
  - create
  apiGroups:
  - ''
  resources:
  - pods
```

seems to fix the issue for POST:
```
sh-4.4$ curl --cacert /etc/kubernetes/certs/service-ca/service-ca.crt -X GET -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" 'https://thanos-querier.openshift-monitoring.svc:9092/api/v1/query?namespace=clusters-dhurta-test-aws'
sh-4.4$ curl --cacert /etc/kubernetes/certs/service-ca/service-ca.crt -X POST -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" 'https://thanos-querier.openshift-monitoring.svc:9092/api/v1/query?namespace=clusters-dhurta-test-aws'
```
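The GET/POST difference reported above is consistent with the usual Kubernetes convention of deriving the authorization verb from the HTTP method. The following is a minimal sketch of that mapping, not the proxy's actual code: the function name `rbac_verb_for_method` is hypothetical, only the GET and POST rows are confirmed by the output in this report, and the remaining rows are assumed from the general Kubernetes convention.

```shell
#!/bin/sh
# Hypothetical helper (not part of any OpenShift tool): print the RBAC verb
# that a Role would need to grant for a given HTTP method.
rbac_verb_for_method() {
  case "$1" in
    GET)    echo get ;;      # confirmed by "verb=get" in this report
    POST)   echo create ;;   # confirmed by "verb=create" in this report
    PUT)    echo update ;;   # assumed from the Kubernetes convention
    PATCH)  echo patch ;;    # assumed from the Kubernetes convention
    DELETE) echo delete ;;   # assumed from the Kubernetes convention
    *)      echo unknown; return 1 ;;  # this sketch does not guess further
  esac
}

# Show the two methods exercised in this report.
for method in GET POST; do
  printf '%s -> %s\n' "$method" "$(rbac_verb_for_method "$method")"
done
```

Under this assumption, a client that issues both GET and POST queries needs both `get` and `create` in its Role, which matches the workaround shown above.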
Version-Release number of selected component (if applicable):
4.14.0-0.nightly-2023-07-20-215234
How reproducible:
2/2
Steps to Reproduce:
1. Use the Cluster Bot to launch a 4.14 nightly cluster (`launch 4.14 aws`).
2. Create a dummy namespace and launch an application inside the namespace.
3. Create a role in the namespace with the rules set to:

   ```
   rules:
   - verbs:
     - get
     apiGroups:
     - metrics.k8s.io/v1beta1
     resources:
     - pods
   ```

4. Create a role binding and bind the role to the app's service account.
5. Access the terminal inside the app's container.
6. Access the tenancy port of thanos-querier using a POST request:

   ```
   curl --cacert /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt -X POST -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" 'https://thanos-querier.openshift-monitoring.svc:9092/api/v1/query?namespace=$NAMESPACE'
   ```
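In step 6 the URL is single-quoted, so a shell passes `$NAMESPACE` through literally instead of expanding it. If the placeholder is meant to be a shell variable, the URL can be built separately, as in this sketch. The function name `tenancy_query_url`, the variable `THANOS_TENANCY`, and the example values `app` and `up` are illustrative assumptions (the values are borrowed from the description); no URL-encoding is done, so only simple query expressions are suitable.

```shell
#!/bin/sh
# Base URL of the tenancy port, as used throughout this report.
THANOS_TENANCY="https://thanos-querier.openshift-monitoring.svc:9092"

# Hypothetical helper: build the query URL for a namespace and a PromQL query.
# $1 = namespace, $2 = query (simple expressions only; nothing is URL-encoded).
tenancy_query_url() {
  printf '%s/api/v1/query?namespace=%s&query=%s\n' "$THANOS_TENANCY" "$1" "$2"
}

# Example values assumed for illustration.
NAMESPACE=app
tenancy_query_url "$NAMESPACE" up

# The result can then be passed to the curl command from step 6, for example:
# curl --cacert /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt \
#   -X POST \
#   -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" \
#   "$(tenancy_query_url "$NAMESPACE" up)"
```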
Actual results:
When running the `curl` command, the output is: `Forbidden (user=system:serviceaccount:app:default, verb=create, resource=pods, subresource=)`
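When triaging such a denial, it can help to pull the user, verb, and resource out of the message to see which permission was actually checked. The sketch below assumes the `Forbidden (key=value, ...)` layout seen in this report; the function name `forbidden_field` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one key from a "Forbidden (...)" line.
# $1 = key (for example verb), $2 = the full message.
forbidden_field() {
  # The key must follow "(" or a space, so "resource" does not match "subresource".
  printf '%s\n' "$2" | sed -n "s/.*[( ]$1=\([^,)]*\).*/\1/p"
}

# Message copied from the actual results of this report.
msg='Forbidden (user=system:serviceaccount:app:default, verb=create, resource=pods, subresource=)'
forbidden_field user "$msg"
forbidden_field verb "$msg"
forbidden_field resource "$msg"
```

The message does not include the API group, so the extracted values alone cannot distinguish core `pods` from `pods` in `metrics.k8s.io`. A command such as `oc auth can-i create pods --as=system:serviceaccount:app:default -n app` can be used to check whether the core-group permission is granted.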
Expected results:
Successfully connecting and receiving the specified metrics. For example: `{"status":"success","data":{"resultType":"vector","result":[]}}`
Additional info:
I wasn't sure whether to mark this bug as a security-related issue. I am marking it `Security Level: Red Hat Employee` because it concerns authorization for accessing user workload metrics.
- blocks
-
OCPBUGS-34023 The expected minimal permissions to access tenancy port of thanos-querier service do not work
- Closed
-
OTA-855 Point hosted CVO at the management-cluster Thanos
- Closed
- is cloned by
-
OCPBUGS-34023 The expected minimal permissions to access tenancy port of thanos-querier service do not work
- Closed
- links to
-
RHEA-2023:7198 rpm