OpenShift Bugs / OCPBUGS-34023

The expected minimal permissions to access tenancy port of thanos-querier service do not work


    • Bug
    • Resolution: Done-Errata
    • Critical
    • 4.14.z
    • 4.14.0
    • Monitoring
    • Moderate
    • No
    • 1
    • MON Sprint 254
    • 1
    • False
    • Before the fix, Thanos Querier was unable to query pod metrics owing to a bug that caused its supporting Kube RBAC Proxy instance to disallow `metrics.k8s.io/v1beta1/pods`. This patch addresses that.
    • Bug Fix
    • Done

      This is a clone of issue OCPBUGS-17035. The following is the description of the original issue:

      Description of problem:

      The expected minimal permissions for accessing the tenancy port of the thanos-querier service in the openshift-monitoring namespace do not work; a different set of permissions works instead. In addition, GET and POST requests require different permissions.
      
      I am trying to use the tenancy port on the thanos-querier service in the openshift-monitoring namespace. I want a pod to access these metrics, so I want to grant that pod only the minimal necessary permissions. From Slack discussions and the configuration for the thanos-querier (https://github.com/openshift/cluster-monitoring-operator/blob/release-4.11/assets/thanos-querier/kube-rbac-proxy-secret.yaml) one would expect that the needed permissions are:
      
      ```
      rules:
        - verbs:
            - get
          apiGroups:
            - metrics.k8s.io/v1beta1
          resources:
            - pods
      ```
      
      However, when binding such a role to a service account (and waiting a little bit for the update to propagate across the system), I get an error from inside its container:
      
      ```
      sh-5.1$ curl --cacert /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt      -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"    'https://thanos-querier.openshift-monitoring.svc:9092/api/v1/query?namespace=app'
      
      Forbidden (user=system:serviceaccount:app:default, verb=get, resource=pods, subresource=)
      ```
      
      The error message suggests that the service account doesn't have the permissions needed. Changing the role's rules and waiting a little bit for the update to propagate across the system seems to fix this. Note the different `apiGroups`:
      
      ```
      rules:
        - verbs:
            - get
          apiGroups:
            - ''
          resources:
            - pods
      ```
      
      This results in successfully connecting to the tenancy port:
      
      ```
      sh-5.1$ curl --cacert /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt      -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"    'https://thanos-querier.openshift-monitoring.svc:9092/api/v1/query?namespace=app&query=up'
      
      {"status":"success","data":{"resultType":"vector","result":[]}}
      ```
      
      A similar issue also affects POST requests to the tenancy port. One would expect the minimal needed permissions to be the same for GET and POST requests. However, this is not the case: GET requests demand the verb `get` and POST requests demand the verb `create`.
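
      The GET/POST asymmetry is consistent with a proxy that derives the RBAC verb from the HTTP method of the incoming request. The sketch below only illustrates that idea: the GET and POST entries match the errors observed in this report, while the remaining entries are an assumption based on the usual Kubernetes verb convention and were not verified here. `rbac_verb_for_method` is a hypothetical helper name, not part of any tool.

      ```shell
      # Map an HTTP method to the RBAC verb that would be checked for it.
      # GET -> get and POST -> create are the pairs seen in this report;
      # PUT, PATCH and DELETE are assumptions following the common convention.
      rbac_verb_for_method() {
        case "$1" in
          GET)    echo get ;;
          POST)   echo create ;;
          PUT)    echo update ;;
          PATCH)  echo patch ;;
          DELETE) echo delete ;;
          *)      echo unknown ;;   # anything else is not covered by this sketch
        esac
      }
      ```

      For example, `rbac_verb_for_method POST` prints `create`, which is the verb shown in the Forbidden message for POST requests.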
      
      When using a service account with a Role having rules as:
      
      ```
      rules:
        - verbs:
            - get
          apiGroups:
            - ''
          resources:
            - pods
      ```
      
      I get the following error for POST (note the `-X GET`/`-X POST` flags and the verb in the error output):
      
      ```
      sh-4.4$ curl --cacert /etc/kubernetes/certs/service-ca/service-ca.crt  -X GET    -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"    'https://thanos-querier.openshift-monitoring.svc:9092/api/v1/query?namespace=clusters-dhurta-test-aws'
      sh-4.4$
      sh-4.4$ curl --cacert /etc/kubernetes/certs/service-ca/service-ca.crt  -X POST    -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"    'https://thanos-querier.openshift-monitoring.svc:9092/api/v1/query?namespace=clusters-dhurta-test-aws'
      Forbidden (user=system:serviceaccount:clusters-dhurta-test-aws:cluster-version-operator, verb=create, resource=pods, subresource=)
      ```
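
      When comparing several such responses, it can help to pull the individual fields out of the Forbidden line. The helper below is a minimal sketch that assumes the message keeps the `key=value, key=value` layout shown above; `forbidden_field` is a hypothetical name introduced only for this illustration.

      ```shell
      # Extract one field (user, verb, resource or subresource) from a
      # "Forbidden (user=..., verb=..., resource=..., subresource=)" line.
      # $1: field name, $2: the full Forbidden line.
      forbidden_field() {
        # The field name must be preceded by "(", "," or a space, so that
        # "resource" does not match inside "subresource".
        printf '%s\n' "$2" | sed -n "s/.*[(, ]$1=\([^,)]*\).*/\1/p"
      }
      ```

      Calling `forbidden_field verb "$line"` on the POST response above yields the verb that was denied, which indicates which verb is missing from the Role.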
      
      Changing the rules to:
      
      ```
      rules:
        - verbs:
            - get
            - create
          apiGroups:
            - ''
          resources:
            - pods
      ```
      
      This seems to fix the issue for POST:
      
      ```
      sh-4.4$ curl --cacert /etc/kubernetes/certs/service-ca/service-ca.crt  -X GET    -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"    'https://thanos-querier.openshift-monitoring.svc:9092/api/v1/query?namespace=clusters-dhurta-test-aws'
      sh-4.4$ curl --cacert /etc/kubernetes/certs/service-ca/service-ca.crt  -X POST    -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"    'https://thanos-querier.openshift-monitoring.svc:9092/api/v1/query?namespace=clusters-dhurta-test-aws'
      ```

      Version-Release number of selected component (if applicable):

      4.14.0-0.nightly-2023-07-20-215234

      How reproducible:

      2/2

      Steps to Reproduce:

      1. Use the Cluster Bot to launch a 4.14 nightly cluster (`launch 4.14 aws`)
      2. Create a dummy namespace and launch an application inside the namespace.
      3. Create a role in the namespace with the rules set to:
      ```
      rules:
        - verbs:
            - get
          apiGroups:
            - metrics.k8s.io/v1beta1
          resources:
            - pods
      ```
      4. Create a role binding and bind the role to the app's service account
      5. Access the terminal inside the app's container
      6. Access the tenancy port of thanos-querier using a POST request
      ```
      curl --cacert /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt -X POST   -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"    "https://thanos-querier.openshift-monitoring.svc:9092/api/v1/query?namespace=$NAMESPACE"
      ```
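
      Steps 3 and 4 can be scripted. The following is a minimal sketch, not the exact commands used for this report: it prints a Role and a RoleBinding for a given namespace, service account, and API group, so the same function can produce both the expected (`metrics.k8s.io/v1beta1`) and the working (`''`) variants. The function name `render_tenancy_rbac` and the object name `thanos-tenancy-reader` are hypothetical placeholders.

      ```shell
      # Print a Role and RoleBinding granting "get" on pods in one API group.
      # $1: namespace, $2: service account name, $3: API group ('' = core group).
      # "thanos-tenancy-reader" is a placeholder name chosen for this sketch.
      render_tenancy_rbac() {
        ns="$1"; sa="$2"; group="$3"
        printf '%s\n' \
          'apiVersion: rbac.authorization.k8s.io/v1' \
          'kind: Role' \
          'metadata:' \
          '  name: thanos-tenancy-reader' \
          "  namespace: ${ns}" \
          'rules:' \
          '  - verbs:' \
          '      - get' \
          '    apiGroups:' \
          "      - '${group}'" \
          '    resources:' \
          '      - pods' \
          '---' \
          'apiVersion: rbac.authorization.k8s.io/v1' \
          'kind: RoleBinding' \
          'metadata:' \
          '  name: thanos-tenancy-reader' \
          "  namespace: ${ns}" \
          'roleRef:' \
          '  apiGroup: rbac.authorization.k8s.io' \
          '  kind: Role' \
          '  name: thanos-tenancy-reader' \
          'subjects:' \
          '  - kind: ServiceAccount' \
          "    name: ${sa}" \
          "    namespace: ${ns}"
      }
      ```

      On a cluster, the output could be applied with `render_tenancy_rbac app default 'metrics.k8s.io/v1beta1' | oc apply -f -`, assuming the namespace `app` and the service account `default` from the example above.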

      Actual results:

      When running the `curl` command the output is:
      
      Forbidden (user=system:serviceaccount:app:default, verb=create, resource=pods, subresource=)

      Expected results:

      Successfully connecting and receiving specified metrics.
      
      For example:
      {"status":"success","data":{"resultType":"vector","result":[]}}

      Additional info:

      I wasn't sure whether to mark this bug as a security-related issue. I am marking it `Security Level: Red Hat Employee` because it concerns the authorization to access user workload metrics.

              prasriva@redhat.com Pranshu Srivastava
              openshift-crt-jira-prow OpenShift Prow Bot
              Junqi Zhao
              Simon Pasquier