Red Hat Advanced Cluster Management / ACM-2551

Upgrade from 2.6 to 2.7: multiclusterhub-operator crash loops with "timed out waiting for cache to be synced"


    • Sprint: ACM Sprint 26, ACM Sprint 27

      Description of problem:

      1. More than 3000 SNO (single-node OpenShift) clusters have been deployed with ACM and ZTP on an ACM 2.6.3 hub.
      2. The hub is then upgraded from ACM 2.6.3 to ACM 2.7.0.
      3. During the upgrade, the multiclusterhub-operator crash loops on the hub.

      Version-Release number of selected component (if applicable):

      ACM 2.6.3-DOWNSTREAM-2022-11-18-01-12-48 (version before the upgrade)

      ACM 2.7.0-DOWNSTREAM-2022-12-21-02-12-51 (fc14) (upgrade target)

      OCP 4.11.13 (hub and managed clusters)

      Additional info:

      Errors in the log:

       

      W1222 03:18:18.502771       1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.2/tools/cache/reflector.go:169: failed to list *v1.Secret: stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 529; INTERNAL_ERROR; received from peer
      I1222 03:18:18.502858       1 trace.go:205] Trace[1009619412]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.25.2/tools/cache/reflector.go:169 (22-Dec-2022 03:17:12.627) (total time: 65875ms):
      Trace[1009619412]: ---"Objects listed" error:stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 529; INTERNAL_ERROR; received from peer 65874ms (03:18:18.502)
      Trace[1009619412]: [1m5.875052249s] [1m5.875052249s] END
      E1222 03:18:18.502878       1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.2/tools/cache/reflector.go:169: Failed to watch *v1.Secret: failed to list *v1.Secret: stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 529; INTERNAL_ERROR; received from peer
      I1222 03:18:39.097732       1 trace.go:205] Trace[877137556]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.25.2/tools/cache/reflector.go:169 (22-Dec-2022 03:17:30.898) (total time: 68198ms):
      Trace[877137556]: ---"Objects listed" error:<nil> 67800ms (03:18:38.699)
      Trace[877137556]: [1m8.198987171s] [1m8.198987171s] END
      W1222 03:19:21.703052       1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.2/tools/cache/reflector.go:169: failed to list *v1.Secret: stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 657; INTERNAL_ERROR; received from peer
      I1222 03:19:21.703104       1 trace.go:205] Trace[104547202]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.25.2/tools/cache/reflector.go:169 (22-Dec-2022 03:18:19.597) (total time: 62105ms):
      Trace[104547202]: ---"Objects listed" error:stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 657; INTERNAL_ERROR; received from peer 62105ms (03:19:21.703)
      Trace[104547202]: [1m2.105293557s] [1m2.105293557s] END
      E1222 03:19:21.703115       1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.2/tools/cache/reflector.go:169: Failed to watch *v1.Secret: failed to list *v1.Secret: stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 657; INTERNAL_ERROR; received from peer
      1.6716791708978999e+09  ERROR   Could not wait for Cache to sync        {"controller": "multiclusterhub", "controllerGroup": "operator.open-cluster-management.io", "controllerKind": "MultiClusterHub", "error": "failed to wait for multiclusterhub caches to sync: timed out waiting for cache to be synced"}
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
              /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.3/pkg/internal/controller/controller.go:215
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start
              /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.3/pkg/internal/controller/controller.go:241
      sigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1
              /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.3/pkg/manager/runnable_group.go:219
      1.6716791708981621e+09  INFO    Stopping and waiting for non leader election runnables
      1.6716791708981748e+09  INFO    Stopping and waiting for leader election runnables
      1.6716791708981948e+09  INFO    Stopping and waiting for caches
      1.6716791708983052e+09  INFO    Stopping and waiting for webhooks
      1.671679170898318e+09   INFO    Wait completed, proceeding to shutdown the manager
      1.6716791708983374e+09  ERROR   setup   problem running manager {"error": "failed to wait for multiclusterhub caches to sync: timed out waiting for cache to be synced"}
      main.main
              /remote-source/app/main.go:204
      runtime.main
              /usr/lib/golang/src/runtime/proc.go:250
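One rough way to read the traces above is to pull out how long each Secret list took and how long each reflector had been running when the sync error was raised. The Python sketch below is not part of the original report. It uses only lines copied from the log, and it assumes that the klog timestamps are in UTC and that the floating-point value on the "Could not wait for Cache to sync" line is a Unix epoch in seconds. controller-runtime documents a default `CacheSyncTimeout` of 2 minutes; whether the operator overrides that value is not stated in this report.

```python
import re
from datetime import datetime, timezone

# Lines copied verbatim from the log above (client-go reflector traces).
TRACE_LINES = [
    'I1222 03:18:18.502858       1 trace.go:205] Trace[1009619412]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.25.2/tools/cache/reflector.go:169 (22-Dec-2022 03:17:12.627) (total time: 65875ms):',
    'Trace[1009619412]: ---"Objects listed" error:stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 529; INTERNAL_ERROR; received from peer 65874ms (03:18:18.502)',
    'I1222 03:18:39.097732       1 trace.go:205] Trace[877137556]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.25.2/tools/cache/reflector.go:169 (22-Dec-2022 03:17:30.898) (total time: 68198ms):',
    'Trace[877137556]: ---"Objects listed" error:<nil> 67800ms (03:18:38.699)',
    'I1222 03:19:21.703104       1 trace.go:205] Trace[104547202]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.25.2/tools/cache/reflector.go:169 (22-Dec-2022 03:18:19.597) (total time: 62105ms):',
    'Trace[104547202]: ---"Objects listed" error:stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 657; INTERNAL_ERROR; received from peer 62105ms (03:19:21.703)',
]

# Timestamp of the "Could not wait for Cache to sync" line (assumed Unix epoch seconds).
SYNC_ERROR_EPOCH = 1.6716791708978999e+09

HEADER = re.compile(
    r'Trace\[(\d+)\]: "Reflector ListAndWatch" .*?'
    r'\((\d{2}-[A-Za-z]{3}-\d{4} [\d:.]+)\) \(total time: (\d+)ms\)'
)
LISTED = re.compile(r'Trace\[(\d+)\]: ---"Objects listed" error:(.*) \d+ms \(')


def parse_traces(lines):
    """Return {trace_id: {"start": datetime, "total_ms": int, "error": str or None}}."""
    traces = {}
    for line in lines:
        m = HEADER.search(line)
        if m:
            # Assumption: the trace start time is UTC.
            start = datetime.strptime(m.group(2), "%d-%b-%Y %H:%M:%S.%f")
            traces[m.group(1)] = {
                "start": start.replace(tzinfo=timezone.utc),
                "total_ms": int(m.group(3)),
                "error": None,
            }
            continue
        m = LISTED.search(line)
        if m and m.group(1) in traces:
            err = m.group(2)
            traces[m.group(1)]["error"] = None if err == "<nil>" else err
    return traces


traces = parse_traces(TRACE_LINES)
error_time = datetime.fromtimestamp(SYNC_ERROR_EPOCH, tz=timezone.utc)

for trace_id, t in traces.items():
    status = "failed" if t["error"] else "ok"
    waited = (error_time - t["start"]).total_seconds()
    print(
        f"Trace[{trace_id}] list took {t['total_ms'] / 1000:.1f}s ({status}); "
        f"started {waited:.1f}s before the sync error"
    )
```

Under those assumptions, every Secret list takes more than a minute, two of the three attempts end in a stream error, and the reflector that started at 03:17:30.898 had been running for about 120 seconds when the sync error was logged. That is consistent with the cache sync hitting a 2 minute timeout, but it does not prove it.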

       

       

       

            Jakob Gray (jagray@redhat.com)
            Alex Krzos (akrzos@redhat.com)
            Ting Xue