Red Hat Advanced Cluster Management / ACM-12059

AlertMonitoring does not mount the ServiceAccount secrets


    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Critical
    • Affects Version: ACM 2.11.0
    • Component: Observability
    • Sprint: MCO Sprint 25, MCO Sprint 26

      Description of problem:

      Installed a cluster with OpenShift 4.16.0-rc3. Cluster Monitoring fails to run Alertmanager: the pods are in CrashLoopBackOff.

      This happens in the three different rbac-proxy-* containers:

            Message:   I0606 16:53:55.542478       1 kube-rbac-proxy.go:530] Reading config file: /etc/kube-rbac-proxy/config.yaml
      E0606 16:53:55.543185       1 run.go:74] "command failed" err="failed to load kubeconfig: cannot find Service Account in pod to build in-cluster rest config: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory" 
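The failing check is the standard Kubernetes in-cluster configuration lookup: the client expects the kubelet to have projected a ServiceAccount token at a well-known path, and builds its REST config from it. A minimal sketch of that check (the path is the Kubernetes convention; the helper name is ours, not from kube-rbac-proxy):

```python
import os

# Conventional path where the kubelet projects the ServiceAccount token.
SA_TOKEN = "/var/run/secrets/kubernetes.io/serviceaccount/token"

def token_mounted(path: str = SA_TOKEN) -> bool:
    """Return True if an in-cluster ServiceAccount token file is present."""
    return os.path.isfile(path)

if __name__ == "__main__":
    if not token_mounted():
        # Same failure mode as the kube-rbac-proxy log above: no token file,
        # so no in-cluster rest config can be built.
        print("cannot build in-cluster config: token not mounted")
```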

       

      The monitoring CO is in this state:

      monitoring                                 4.16.0-rc.3   False       True          True       10m     UpdatingAlertmanager: waiting for Alertmanager object changes failed: waiting for Alertmanager openshift-monitoring/main: context deadline exceeded 

      In both cases we see that it is not mounting anything under `/var/run/secrets/kubernetes.io/serviceaccount/`.
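For comparison, on a healthy pod the kubelet normally injects a projected `kube-api-access-*` volume for the ServiceAccount, which is what backs the `/var/run/secrets/kubernetes.io/serviceaccount/` mount. A sketch of what that volume usually looks like in the Pod spec (the volume name suffix is illustrative, not taken from this cluster):

```yaml
volumes:
  - name: kube-api-access-example   # real names carry a random suffix
    projected:
      sources:
        - serviceAccountToken:
            expirationSeconds: 3607
            path: token
        - configMap:
            name: kube-root-ca.crt
            items:
              - key: ca.crt
                path: ca.crt
        - downwardAPI:
            items:
              - path: namespace
                fieldRef:
                  fieldPath: metadata.namespace
```

In the failing pods above, no such volume (and no corresponding mount) appears at all.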

       

        kube-rbac-proxy:
          Container ID:  cri-o://7b52d7551fb9d32cebc35a595fe5c2657e66bc144e2e9c7f5f30357db60f4c22
          Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:785d655bc4d777104cdd8951fd835df4559415e46f67ab47c8ac913c2764d76b
          Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:785d655bc4d777104cdd8951fd835df4559415e46f67ab47c8ac913c2764d76b
          Port:          9092/TCP
          Host Port:     0/TCP
          Args:
            --secure-listen-address=0.0.0.0:9092
            --upstream=http://127.0.0.1:9096
            --config-file=/etc/kube-rbac-proxy/config.yaml
            --tls-cert-file=/etc/tls/private/tls.crt
            --tls-private-key-file=/etc/tls/private/tls.key
            --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256
            --tls-min-version=VersionTLS12
          State:       Waiting
            Reason:    CrashLoopBackOff
          Last State:  Terminated
            Reason:    Error
            Message:   I0606 16:53:55.643904       1 kube-rbac-proxy.go:530] Reading config file: /etc/kube-rbac-proxy/config.yaml
      E0606 16:53:55.644944       1 run.go:74] "command failed" err="failed to load kubeconfig: cannot find Service Account in pod to build in-cluster rest config: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory"
            Exit Code:    1
            Started:      Thu, 06 Jun 2024 12:53:55 -0400
            Finished:     Thu, 06 Jun 2024 12:53:55 -0400
          Ready:          False
          Restart Count:  21
          Limits:
            management.workload.openshift.io/cores:  1
          Requests:
            management.workload.openshift.io/cores:  1
            memory:                                  15Mi
          Environment:                               <none>
          Mounts:
            /etc/kube-rbac-proxy from secret-alertmanager-kube-rbac-proxy (rw)
            /etc/tls/private from secret-alertmanager-main-tls (rw) 

      In a working environment (OCP 4.14), we can see it mounting the SA token for the same container (kube-rbac-proxy):

       

      In the Pod.spec we don't see the usual mounts with the secrets for the ServiceAccount:

       kube-rbac-proxy:                                                                 
          Container ID:  cri-o://7b52d7551fb9d32cebc35a595fe5c2657e66bc144e2e9c7f5f30357db60f4c22
          Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:785d655bc4d777104cdd8951fd835df4559415e46f67ab47c8ac913c2764d76b
          Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:785d655bc4d777104cdd8951fd835df4559415e46f67ab47c8ac913c2764d76b
          Port:          9092/TCP                                                       
          Host Port:     0/TCP                                                                                                                                                                                                              
          Mounts:                                                                    
            /etc/kube-rbac-proxy from secret-alertmanager-kube-rbac-proxy (rw)          
            /etc/tls/private from secret-alertmanager-main-tls (rw)      

      The cluster is a managed cluster of an ACM hub cluster.
      We don't know if it is related, but observability was enabled for a while; after disabling it, the error remains the same.
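One thing worth checking in this situation (assuming cluster access; the resource names below are the standard openshift-monitoring ones and may differ) is whether token automounting was disabled on the ServiceAccount or on the Pod, since either setting suppresses the `kube-api-access-*` volume:

```shell
# ServiceAccount-level switch (an absent field means "true").
oc -n openshift-monitoring get sa alertmanager-main \
  -o jsonpath='{.automountServiceAccountToken}{"\n"}'

# Pod-level switch, which overrides the ServiceAccount setting.
oc -n openshift-monitoring get pod alertmanager-main-0 \
  -o jsonpath='{.spec.automountServiceAccountToken}{"\n"}'
```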

       

      Version-Release number of selected component (if applicable):

      4.16.0-rc3

      How reproducible:

      Create an SNO as a managed cluster with the RHACM Assisted Service.

      Steps to Reproduce:

      1.
      2.
      3.
      

      Actual results:

       

      Expected results:

       

      Additional info:

       

              pgough@redhat.com Philip Gough
              jgato@redhat.com Jose Gato Luis
              Xiang Yin