When the control plane nodes are under pressure or the apiserver is unavailable, the monitoring stack emits no telemetry data, even though monitoring does not run on the master nodes and should not need to interact with the control plane to push metrics.
This happens because telemeter-client currently evaluates PromQL expressions on Prometheus through an oauth-proxy endpoint, which has to talk to the apiserver to authenticate requests.
After discussing with firstname.lastname@example.org, a potential solution to remove the dependency on the apiserver would be to use mTLS communication between telemeter-client and the Prometheus pods.
Today, there are 3 proxies in the Prometheus pods:
- oauth proxy for the API
- kube-rbac-proxy for Prometheus metrics
- kube-rbac-proxy for the Thanos sidecar
The kube-rbac-proxy exposing the /metrics endpoint could be used by telemeter-client, since it already exposes that endpoint over mTLS.
Note that this approach would require improving telemeter-client, since it doesn't currently support configuring TLS certs/keys.