  1. Red Hat Advanced Cluster Management
  2. ACM-14586

[ACM 2.10] multicluster-observability-operator pod restarts with fatal error concurrent map writes


    • 1
    • False
    • None
    • False
    • MCO Sprint 27, MCO Sprint 30, Observability Sprint 31, Observability Sprint 32, Observability Sprint 33, Observability Sprint 34
    • Low
    • No

      Description of problem:

      During an ACM perf/scale test, the multicluster-observability-operator pod restarted 4 times. After the last restart it ran for at least 15 hours, until we destroyed the environment for the next test run. After discussion with smeduri1@redhat.com, I am creating this as a minor bug to track the issue.
      The error below is from the previous pod's logs:

      2024-06-25T00:21:03.504Z INFO controller_placementrule Reconciling PlacementRule {"Request.Namespace": "vm00700", "Request.Name": "vm00700-observability"}
      fatal error: concurrent map writes

      goroutine 1128 [running]:
      github.com/stolostron/multicluster-observability-operator/operators/multiclusterobservability/controllers/placementrule.(*PlacementRuleReconciler).Reconcile(0xc000b521e0, {0x0?, 0x0?}, {0xc02abb9736?, 0x5?}, {0xc01cdca8d0?, 0xc03bd8dd10?})
          /remote-source/app/operators/multiclusterobservability/controllers/placementrule/placementrule_controller.go:99 +0x1db
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x4355b10?, {0x434e080?, 0xc027d08330?}, {0xc02abb9736?, 0xb?}, {0xc01cdca8d0?, 0x0?})
          /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:119 +0xb7
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc014616280, {0x434e0b8, 0xc000f8dbd0}, {0x38162a0, 0xc026a6f640})
          /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:316 +0x3bc
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc014616280, {0x434e0b8, 0xc000f8dbd0})
          /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266 +0x1be
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
          /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227 +0x79
      created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 806
          /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:223 +0x50c

      goroutine 1 [select, 165 minutes]:
      sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).Start(0xc0012e89c0, {0x434e0b8, 0xc000ac5090})
          /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/manager/internal.go:462 +0x912
      main.main()
          /remote-source/app/operators/multiclusterobservability/main.go:336 +0x2d42

      goroutine 80 [chan receive]:
      k8s.io/klog.(*loggingT).flushDaemon(0x5c37740)
          /remote-source/app/vendor/k8s.io/klog/klog.go:1010 +0x66
      created by k8s.io/klog.init.0 in goroutine 1
          /remote-source/app/vendor/k8s.io/klog/klog.go:411 +0xd2

      goroutine 197 [chan receive, 165 minutes]:
      k8s.io/client-go/tools/cache.(*controller).Run.func1()
          /remote-source/app/vendor/k8s.io/client-go/tools/cache/controller.go:132 +0x25
      created by k8s.io/client-go/tools/cache.(*controller).Run in goroutine 278
          /remote-source/app/vendor/k8s.io/client-go/tools/cache/controller.go:131 +0xa9

      goroutine 196 [chan receive, 165 minutes]:
      k8s.io/client-go/tools/cache.(*sharedProcessor).run(0xc000f8c3c0, 0xc000e4c120)
          /remote-source/app/vendor/k8s.io/client-go/tools/cache/shared_informer.go:801 +0x4d
      k8s.io/client-go/tools/cache.(*sharedIndexInformer).Run.(*Group).StartWithChannel.func4()
          /remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:55 +0x1b
      k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1()
          /remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:72 +0x52
      created by k8s.io/apimachinery/pkg/util/wait.(*Group).Start in goroutine 278
          /remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:70 +0x73

      goroutine 149408 [select, 1 minutes]:
      golang.org/x/net/http2.(*clientStream).writeRequest(0xc00de87680, 0xc04800ad80, 0x0)
          /remote-source/app/vendor/golang.org/x/net/http2/transport.go:1536 +0xa85
      golang.org/x/net/http2.(*clientStream).doRequest(0xc00de87680, 0x434e080?, 0xc00104dc20?)
          /remote-source/app/vendor/golang.org/x/net/http2/transport.go:1414 +0x56
      created by golang.org/x/net/http2.(*ClientConn).roundTrip in goroutine 1272
          /remote-source/app/vendor/golang.org/x/net/http2/transport.go:1319 +0x3e5

      goroutine 212 [runnable]:
      crypto/tls.(*Conn).readFromUntil(0xc0002e8388?, {0x4321fa0?, 0xc00047a5f8?}, 0x73c?)
          /usr/lib/golang/src/crypto/tls/conn.go:819 +0x129
      crypto/tls.(*Conn).readRecordOrCCS(0xc0002e8388, 0x0)
          /usr/lib/golang/src/crypto/tls/conn.go:677 +0xd3e
      crypto/tls.(*Conn).readRecord(...)
          /usr/lib/golang/src/crypto/tls/conn.go:588
      crypto/tls.(*Conn).Read(0xc0002e8388, {0xc000d3a000, 0x1000, 0x17add51?})
          /usr/lib/golang/src/crypto/tls/conn.go:1370 +0x156
      bufio.(*Reader).Read(0xc000d388a0, {0xc00027cc80, 0x9, 0x0?})
          /usr/lib/golang/src/bufio/bufio.go:241 +0x197
      io.ReadAtLeast({0x43222a0, 0xc000d388a0}, {0xc00027cc80, 0x9, 0x9}, 0x9)
          /usr/lib/golang/src/io/io.go:335 +0x90
      io.ReadFull(...)
          /usr/lib/golang/src/io/io.go:354
      golang.org/x/net/http2.readFrameHeader({0xc00027cc80, 0x9, 0x1492dc0?}, {0x43222a0?, 0xc000d388a0?})
          /remote-source/app/vendor/golang.org/x/net/http2/frame.go:237 +0x65
      golang.org/x/net/http2.(*Framer).ReadFrame(0xc00027cc40)
          /remote-source/app/vendor/golang.org/x/net/http2/frame.go:501 +0x85
      golang.org/x/net/http2.(*clientConnReadLoop).run(0xc001492fa8)
          /remote-source/app/vendor/golang.org/x/net/http2/transport.go:2358 +0xda
      golang.org/x/net/http2.(*ClientConn).readLoop(0xc000b33980)
          /remote-source/app/vendor/golang.org/x/net/http2/transport.go:2254 +0x8b
      created by golang.org/x/net/http2.(*Transport).newClientConn in goroutine 211
          /remote-source/app/vendor/golang.org/x/net/http2/transport.go:869 +0xd1b

      Version-Release number of selected component (if applicable):

      How reproducible:

      Steps to Reproduce:

      1.  
      2.  
      3. ...

      Actual results:

      Expected results:

      Additional info:

              rh-ee-coquadro Coleen Iona Quadros
              rhn-support-txue Ting Xue
              Xiang Yin Xiang Yin
              ACM QE Team
