Bug
Resolution: Cannot Reproduce
Critical
None
ACM 2.11.0
1
False
False
MCO Sprint 25, MCO Sprint 26
No
Description of problem:
Installed a cluster with OpenShift 4.16.0-rc3. Cluster Monitoring fails to run Alertmanager: its pods are in CrashLoopBackOff.
This happens in the three different rbac-proxy-* containers.
Message: I0606 16:53:55.542478 1 kube-rbac-proxy.go:530] Reading config file: /etc/kube-rbac-proxy/config.yaml E0606 16:53:55.543185 1 run.go:74] "command failed" err="failed to load kubeconfig: cannot find Service Account in pod to build in-cluster rest config: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory"
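For context on the error above: the in-cluster config loader fails in exactly this way when the projected service account token file was never mounted. A minimal illustrative sketch of that check (this is not the actual client-go code, just the equivalent logic):

```python
import os

# Standard Kubernetes service account projection paths (illustrative constants).
SA_DIR = "/var/run/secrets/kubernetes.io/serviceaccount"
TOKEN_PATH = os.path.join(SA_DIR, "token")

def load_in_cluster_token(token_path=TOKEN_PATH):
    """Return the service account token, or raise if it was never mounted."""
    try:
        with open(token_path) as f:
            return f.read().strip()
    except FileNotFoundError as err:
        # This missing-file condition is what produces the pod's error message.
        raise RuntimeError(
            "cannot find Service Account in pod to build in-cluster "
            f"rest config: {err}"
        ) from err
```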
The monitoring CO is in this state:
monitoring 4.16.0-rc.3 False True True 10m UpdatingAlertmanager: waiting for Alertmanager object changes failed: waiting for Alertmanager openshift-monitoring/main: context deadline exceeded
In both we see that nothing is being mounted inside `/var/run/secrets/kubernetes.io/serviceaccount/`.
kube-rbac-proxy:
  Container ID:  cri-o://7b52d7551fb9d32cebc35a595fe5c2657e66bc144e2e9c7f5f30357db60f4c22
  Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:785d655bc4d777104cdd8951fd835df4559415e46f67ab47c8ac913c2764d76b
  Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:785d655bc4d777104cdd8951fd835df4559415e46f67ab47c8ac913c2764d76b
  Port:          9092/TCP
  Host Port:     0/TCP
  Args:
    --secure-listen-address=0.0.0.0:9092
    --upstream=http://127.0.0.1:9096
    --config-file=/etc/kube-rbac-proxy/config.yaml
    --tls-cert-file=/etc/tls/private/tls.crt
    --tls-private-key-file=/etc/tls/private/tls.key
    --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256
    --tls-min-version=VersionTLS12
  State:          Waiting
    Reason:       CrashLoopBackOff
  Last State:     Terminated
    Reason:       Error
    Message:      I0606 16:53:55.643904 1 kube-rbac-proxy.go:530] Reading config file: /etc/kube-rbac-proxy/config.yaml
                  E0606 16:53:55.644944 1 run.go:74] "command failed" err="failed to load kubeconfig: cannot find Service Account in pod to build in-cluster rest config: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory"
    Exit Code:    1
    Started:      Thu, 06 Jun 2024 12:53:55 -0400
    Finished:     Thu, 06 Jun 2024 12:53:55 -0400
  Ready:          False
  Restart Count:  21
  Limits:
    management.workload.openshift.io/cores:  1
  Requests:
    management.workload.openshift.io/cores:  1
    memory:                                  15Mi
  Environment:  <none>
  Mounts:
    /etc/kube-rbac-proxy from secret-alertmanager-kube-rbac-proxy (rw)
    /etc/tls/private from secret-alertmanager-main-tls (rw)
In a working environment (OCP 4.14) I can see the SA being mounted for the same container (kube-rbac-proxy).
In the Pod.spec we don't see the usual volume mounts with the ServiceAccount secrets.
kube-rbac-proxy:
  Container ID:  cri-o://7b52d7551fb9d32cebc35a595fe5c2657e66bc144e2e9c7f5f30357db60f4c22
  Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:785d655bc4d777104cdd8951fd835df4559415e46f67ab47c8ac913c2764d76b
  Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:785d655bc4d777104cdd8951fd835df4559415e46f67ab47c8ac913c2764d76b
  Port:          9092/TCP
  Host Port:     0/TCP
  Mounts:
    /etc/kube-rbac-proxy from secret-alertmanager-kube-rbac-proxy (rw)
    /etc/tls/private from secret-alertmanager-main-tls (rw)
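A quick way to confirm the missing mount is to scan the pod spec for any volumeMount under the service account path. A minimal sketch; the `broken_pod` dict is a hypothetical, trimmed stand-in for `oc get pod -o json` output:

```python
SA_MOUNT_PATH = "/var/run/secrets/kubernetes.io/serviceaccount"

def has_sa_token_mount(pod: dict) -> bool:
    """True if any container mounts the service account token directory."""
    for container in pod.get("spec", {}).get("containers", []):
        for mount in container.get("volumeMounts", []):
            if mount.get("mountPath", "").startswith(SA_MOUNT_PATH):
                return True
    return False

# Hypothetical, trimmed spec matching what the broken Alertmanager pod shows:
broken_pod = {
    "spec": {
        "containers": [
            {
                "name": "kube-rbac-proxy",
                "volumeMounts": [
                    {"mountPath": "/etc/kube-rbac-proxy"},
                    {"mountPath": "/etc/tls/private"},
                ],
            }
        ]
    }
}

print(has_sa_token_mount(broken_pod))  # → False; a healthy pod would print True
```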
The cluster is a managed cluster of an ACM management cluster.
Don't know if it is related, but for a while we had Observability enabled; after disabling it the error is the same.
Version-Release number of selected component (if applicable):
4.16.0-rc3
How reproducible:
Create an SNO as a managed cluster with the RHACM Assisted Service.
Steps to Reproduce:
1.
2.
3.
Actual results:
Expected results:
Additional info: