Red Hat Advanced Cluster Management / ACM-12059

AlertMonitoring does not mount the ServiceAccount secrets


    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Critical
    • Affects Version: ACM 2.11.0
    • Component: Observability
    • Sprint: MCO Sprint 25, MCO Sprint 26

      Description of problem:

      Installed a cluster with OpenShift 4.16.0-rc3. Cluster Monitoring fails to run Alertmanager: the pods are in CrashLoopBackOff.

      This happens in the three different rbac-proxy-* containers:

            Message:   I0606 16:53:55.542478       1 kube-rbac-proxy.go:530] Reading config file: /etc/kube-rbac-proxy/config.yaml
      E0606 16:53:55.543185       1 run.go:74] "command failed" err="failed to load kubeconfig: cannot find Service Account in pod to build in-cluster rest config: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory" 
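The failing check is the standard Kubernetes in-cluster configuration lookup: the client expects the kubelet to have projected a ServiceAccount token at a well-known path, and builds its REST config from it. A minimal sketch of that check (the path is the Kubernetes convention; the helper name is ours, not from kube-rbac-proxy):

```python
import os

# Conventional path where the kubelet projects the ServiceAccount token.
SA_TOKEN = "/var/run/secrets/kubernetes.io/serviceaccount/token"

def token_mounted(path: str = SA_TOKEN) -> bool:
    """Return True if an in-cluster ServiceAccount token file is present."""
    return os.path.isfile(path)

if __name__ == "__main__":
    if not token_mounted():
        # Same failure mode as the kube-rbac-proxy log above: no token file,
        # so no in-cluster rest config can be built.
        print("cannot build in-cluster config: token not mounted")
```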

       

      The monitoring CO is in this state:

      monitoring                                 4.16.0-rc.3   False       True          True       10m     UpdatingAlertmanager: waiting for Alertmanager object changes failed: waiting for Alertmanager openshift-monitoring/main: context deadline exceeded 

      In both cases we see that it is not mounting anything under `/var/run/secrets/kubernetes.io/serviceaccount/`.
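For comparison, on a healthy pod the kubelet normally injects a projected `kube-api-access-*` volume for the ServiceAccount, which is what backs the `/var/run/secrets/kubernetes.io/serviceaccount/` mount. A sketch of what that volume usually looks like in the Pod spec (the volume name suffix is illustrative, not taken from this cluster):

```yaml
volumes:
  - name: kube-api-access-example   # real names carry a random suffix
    projected:
      sources:
        - serviceAccountToken:
            expirationSeconds: 3607
            path: token
        - configMap:
            name: kube-root-ca.crt
            items:
              - key: ca.crt
                path: ca.crt
        - downwardAPI:
            items:
              - path: namespace
                fieldRef:
                  fieldPath: metadata.namespace
```

In the failing pods above, no such volume (and no corresponding mount) appears at all.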

       

        kube-rbac-proxy:
          Container ID:  cri-o://7b52d7551fb9d32cebc35a595fe5c2657e66bc144e2e9c7f5f30357db60f4c22
          Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:785d655bc4d777104cdd8951fd835df4559415e46f67ab47c8ac913c2764d76b
          Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:785d655bc4d777104cdd8951fd835df4559415e46f67ab47c8ac913c2764d76b
          Port:          9092/TCP
          Host Port:     0/TCP
          Args:
            --secure-listen-address=0.0.0.0:9092
            --upstream=http://127.0.0.1:9096
            --config-file=/etc/kube-rbac-proxy/config.yaml
            --tls-cert-file=/etc/tls/private/tls.crt
            --tls-private-key-file=/etc/tls/private/tls.key
            --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256
            --tls-min-version=VersionTLS12
          State:       Waiting
            Reason:    CrashLoopBackOff
          Last State:  Terminated
            Reason:    Error
            Message:   I0606 16:53:55.643904       1 kube-rbac-proxy.go:530] Reading config file: /etc/kube-rbac-proxy/config.yaml
      E0606 16:53:55.644944       1 run.go:74] "command failed" err="failed to load kubeconfig: cannot find Service Account in pod to build in-cluster rest config: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory"
            Exit Code:    1
            Started:      Thu, 06 Jun 2024 12:53:55 -0400
            Finished:     Thu, 06 Jun 2024 12:53:55 -0400
          Ready:          False
          Restart Count:  21
          Limits:
            management.workload.openshift.io/cores:  1
          Requests:
            management.workload.openshift.io/cores:  1
            memory:                                  15Mi
          Environment:                               <none>
          Mounts:
            /etc/kube-rbac-proxy from secret-alertmanager-kube-rbac-proxy (rw)
            /etc/tls/private from secret-alertmanager-main-tls (rw) 

      In a working environment (OCP 4.14), we can see it mounting the SA token for the same container (kube-rbac-proxy):

       

      In the Pod.spec we don't see the usual mounts with the secrets for the ServiceAccount:

       kube-rbac-proxy:                                                                 
          Container ID:  cri-o://7b52d7551fb9d32cebc35a595fe5c2657e66bc144e2e9c7f5f30357db60f4c22
          Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:785d655bc4d777104cdd8951fd835df4559415e46f67ab47c8ac913c2764d76b
          Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:785d655bc4d777104cdd8951fd835df4559415e46f67ab47c8ac913c2764d76b
          Port:          9092/TCP                                                       
          Host Port:     0/TCP                                                                                                                                                                                                              
          Mounts:                                                                    
            /etc/kube-rbac-proxy from secret-alertmanager-kube-rbac-proxy (rw)          
            /etc/tls/private from secret-alertmanager-main-tls (rw)      

      The cluster is a managed cluster of an ACM hub cluster.
      We don't know if it is related, but observability was enabled for a while; after disabling it, the error remains the same.
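One thing worth checking in this situation (assuming cluster access; the resource names below are the standard openshift-monitoring ones and may differ) is whether token automounting was disabled on the ServiceAccount or on the Pod, since either setting suppresses the `kube-api-access-*` volume:

```shell
# ServiceAccount-level switch (an absent field means "true").
oc -n openshift-monitoring get sa alertmanager-main \
  -o jsonpath='{.automountServiceAccountToken}{"\n"}'

# Pod-level switch, which overrides the ServiceAccount setting.
oc -n openshift-monitoring get pod alertmanager-main-0 \
  -o jsonpath='{.spec.automountServiceAccountToken}{"\n"}'
```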

       

      Version-Release number of selected component (if applicable):

      4.16.0-rc3

      How reproducible:

      Create an SNO as a managed cluster with the RHACM Assisted Service.

      Steps to Reproduce:

      1.
      2.
      3.
      

      Actual results:

       

      Expected results:

       

      Additional info:

       

              pgough@redhat.com Philip Gough
              jgato@redhat.com Jose Gato Luis
              Xiang Yin