OpenShift Monitoring / MON-2813

During apiserver disruption telemetry data is missing


    • Epic
    • Resolution: Done
    • Normal
    • openshift-4.14
    • mtls for telemeter-client
    • MON-3155 Insights through metric telemetry
    • 0% To Do, 0% In Progress, 100% Done
    • MON Sprint 239

      When the control plane nodes are under pressure or the apiserver is unavailable, the monitoring stack emits no telemetry data, even though the monitoring components do not run on master nodes and should not need to interact with the control plane in order to push metrics.

      This happens because telemeter-client currently evaluates PromQL expressions on Prometheus through an oauth-proxy endpoint, which has to talk to the apiserver to authenticate requests.

      After discussion with spasquie@redhat.com, a potential way to remove the dependency on the apiserver is to use mTLS for communication between telemeter-client and the Prometheus pods.

      Today, there are 3 proxies in the Prometheus pods:

      • oauth-proxy for the API
      • kube-rbac-proxy for Prometheus metrics
      • kube-rbac-proxy for the Thanos sidecar

      telemeter-client could use the kube-rbac-proxy that exposes the /metrics endpoint, since that proxy already serves it over mTLS.

      Note that this approach would require improving telemeter-client, since it does not currently support configuring TLS certificates and keys.
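
      To illustrate what "configuring TLS certificates and keys" means for a client, here is a minimal sketch of an mTLS-capable HTTPS client. It is written in Python for brevity (telemeter-client itself is written in Go) and is not the actual telemeter-client change. The function names build_mtls_context and fetch, and all file paths, are hypothetical and chosen only for this example.

```python
import ssl
import urllib.request


def build_mtls_context(ca_file: str, cert_file: str, key_file: str) -> ssl.SSLContext:
    """Build a TLS context for mutual TLS (illustrative helper, not a real API)."""
    # Verify the server certificate against the given CA bundle.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    # Present a client certificate so the server can authenticate the caller
    # from the TLS handshake instead of from a bearer token.
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


def fetch(url: str, context: ssl.SSLContext, timeout: float = 30.0) -> bytes:
    """Send a GET request over the mTLS context and return the response body."""
    opener = urllib.request.build_opener(urllib.request.HTTPSHandler(context=context))
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

      A caller would build the context once from the mounted CA bundle, client certificate, and key, then pass it to fetch together with the URL of the mTLS-protected endpoint. Whether the server also needs the apiserver to authorize the request depends on how the proxy is configured, which this sketch does not cover.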

            mariofer@redhat.com Mario Fernandez Herrero
            dgrisonn@redhat.com Damien Grisonnet
            Junqi Zhao