  1. Red Hat Advanced Cluster Management
  2. ACM-17944

ACM 2.11 installation fails due to caching limits in MultiClusterHub operator

    • Installer Sprint 2025-53, Installer Sprint 2025-54
    • Important
    • None

      Description of problem:

      In ACM 2.11.5, on a very large cluster that runs many operators, the MCH operator can fail to list or get ClusterServiceVersion resources. This appears to happen because the client-go (v0.29.3) cache that the MCH operator uses cannot finish listing the large number of ClusterServiceVersion resources on the cluster; the list request ends in a stream error.

      The multiclusterhub-operator logs are filled with the following errors:
      2025-02-13T05:55:07.111684784Z E0213 05:55:07.111672       1 reflector.go:147] pkg/mod/k8s.io/client-go@v0.29.3/tools/cache/reflector.go:229: Failed to watch *v1alpha1.ClusterServiceVersion: failed to list *v1alpha1.ClusterServiceVersion: stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 411; INTERNAL_ERROR; received from peer
      2025-02-13T05:55:44.926615138Z 2025-02-13T05:55:44.926Z INFO    mce     MCE updated, but did not find required labels: map[]
      2025-02-13T05:57:05.410379736Z W0213 05:57:05.410290       1 reflector.go:539] pkg/mod/k8s.io/client-go@v0.29.3/tools/cache/reflector.go:229: failed to list *v1alpha1.ClusterServiceVersion: stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 429; INTERNAL_ERROR; received from peer
      2025-02-13T05:57:05.410464822Z I0213 05:57:05.410368       1 trace.go:236] Trace[1766956569]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.29.3/tools/cache/reflector.go:229 (13-Feb-2025 05:56:00.296) (total time: 65114ms):
      2025-02-13T05:57:05.410464822Z Trace[1766956569]: ---"Objects listed" error:stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 429; INTERNAL_ERROR; received from peer 65113ms (05:57:05.410)
      2025-02-13T05:57:05.410464822Z Trace[1766956569]: [1m5.114057034s] [1m5.114057034s] END
      2025-02-13T05:57:05.410464822Z E0213 05:57:05.410388       1 reflector.go:147] pkg/mod/k8s.io/client-go@v0.29.3/tools/cache/reflector.go:229: Failed to watch *v1alpha1.ClusterServiceVersion: failed to list *v1alpha1.ClusterServiceVersion: stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 429; INTERNAL_ERROR; received from peer 
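To gauge how often the list call fails and how long each attempt runs before it is cut off, the operator log can be summarized with a short script. The following is a minimal sketch, not part of the operator: the function name `summarize`, the two regular expressions, and the `SAMPLE_LOG` variable are illustrative assumptions, and the sample lines are copied from the log excerpt above.

```python
import re
from collections import Counter

# Matches client-go reflector messages such as
# "failed to list *v1alpha1.ClusterServiceVersion: ...".
# Case-insensitive so that differently capitalized variants also match.
LIST_FAILURE = re.compile(r"failed to list \*(?P<kind>[\w.]+):", re.IGNORECASE)

# Matches the trace summary, for example "(total time: 65114ms)".
TOTAL_TIME = re.compile(r"\(total time: (?P<ms>\d+)ms\)")


def summarize(log_text):
    """Count list failures per resource kind and collect ListAndWatch durations."""
    failures = Counter()
    durations_ms = []
    for line in log_text.splitlines():
        failure = LIST_FAILURE.search(line)
        if failure:
            # search() returns only the first match, so each line counts once.
            failures[failure.group("kind")] += 1
        total = TOTAL_TIME.search(line)
        if total:
            durations_ms.append(int(total.group("ms")))
    return failures, durations_ms


# Three lines taken from the log excerpt in this report.
SAMPLE_LOG = '''2025-02-13T05:55:07.111684784Z E0213 05:55:07.111672       1 reflector.go:147] pkg/mod/k8s.io/client-go@v0.29.3/tools/cache/reflector.go:229: Failed to watch *v1alpha1.ClusterServiceVersion: failed to list *v1alpha1.ClusterServiceVersion: stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 411; INTERNAL_ERROR; received from peer
2025-02-13T05:57:05.410379736Z W0213 05:57:05.410290       1 reflector.go:539] pkg/mod/k8s.io/client-go@v0.29.3/tools/cache/reflector.go:229: failed to list *v1alpha1.ClusterServiceVersion: stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 429; INTERNAL_ERROR; received from peer
2025-02-13T05:57:05.410464822Z I0213 05:57:05.410368       1 trace.go:236] Trace[1766956569]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.29.3/tools/cache/reflector.go:229 (13-Feb-2025 05:56:00.296) (total time: 65114ms):
'''

if __name__ == "__main__":
    failures, durations_ms = summarize(SAMPLE_LOG)
    for kind, count in failures.items():
        print(f"{kind}: {count} failed list attempts")
    print("ListAndWatch durations (ms):", durations_ms)
```

A steady stream of failures for the same kind, each after roughly the same duration, would be consistent with the list request being cut off before the full result set is delivered.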

       

      Version-Release number of selected component (if applicable):

      ACM 2.11.5

      How reproducible:

      Always for the customer.

      Steps to Reproduce:

      1. Install ACM 2.11.5
      2. Create MultiClusterHub resource

      Actual results:

      The MultiClusterHub resource remains stuck in the Installing phase, with the condition Progressing: True and reason NewResourceCreated.
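The stuck state described above can be checked programmatically once the resource has been exported as JSON (for example with `oc get multiclusterhub -o json`). This is a minimal sketch under assumptions: the field layout (`status.phase` and a `status.conditions` list with `type`, `status`, and `reason`) is assumed from the values quoted in this report, and `is_stuck_installing` and `SAMPLE_MCH` are hypothetical names used only for illustration.

```python
def is_stuck_installing(mch):
    """Return True if the MultiClusterHub object shows the state from this report.

    `mch` is the resource as a dict, as parsed from JSON. The field layout
    is an assumption; adjust the keys if the actual status differs.
    """
    status = mch.get("status", {})
    if status.get("phase") != "Installing":
        return False
    # Look for a Progressing condition that is True with the reported reason.
    return any(
        cond.get("type") == "Progressing"
        and cond.get("status") == "True"
        and cond.get("reason") == "NewResourceCreated"
        for cond in status.get("conditions", [])
    )


# Hypothetical object mirroring only the values quoted in this report.
SAMPLE_MCH = {
    "kind": "MultiClusterHub",
    "status": {
        "phase": "Installing",
        "conditions": [
            {"type": "Progressing", "status": "True", "reason": "NewResourceCreated"},
        ],
    },
}

if __name__ == "__main__":
    print("stuck installing:", is_stuck_installing(SAMPLE_MCH))
```

The check only reports a point-in-time state; a resource that is still legitimately installing would match as well, so it is meaningful only when the state persists across repeated checks.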

      Expected results:

      The MultiClusterHub phase should reach Running.

      Additional info:

      • OpenShift version: 4.14.32

              dbennett@redhat.com Disaiah Bennett
              dbennett@redhat.com Disaiah Bennett
              Kurtis Wang Kurtis Wang