- Bug
- Resolution: Unresolved
- Undefined
- None
- ACM 2.13.4
- None
- Incidents & Support
- False
- False
- contract-priority
- Moderate
- None
Description of problem:
When installing a SNO managed cluster, the klusterlet agent container restarts, throwing:
I1124 08:05:56.342822 1 event.go:377] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"open-cluster-management-agent", Name:"klusterlet-agent", UID:"777f027c-7779-42f8-80c0-abbc777ce35", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'Deployment Updated' Updated open-cluster-management-agent-addon/config-policy-controller
fatal error: concurrent map read and map write
goroutine 922 [running]:
github.com/openshift/library-go/pkg/operator/resource/resourceapply.(*resourceCache).SafeToSkipApply(0xc00060a6f8, {0x3503ca8?, 0xc001baba40?}, {0x3503ca8, 0xc001babcc0})
github.com/openshift/library-go@v0.0.0-20241107160307-0064ad7bd060/pkg/operator/resource/resourceapply/resource_cache.go:148 +0x13a
github.com/openshift/library-go/pkg/operator/resource/resourceapply.ApplySecretImproved({0x352da48, 0xc0007144b0}, {0x7f5ac46bd310, 0xc000d8c9c0}, {0x353fb20, 0xc0008730a0}, 0xc001baba40, {0x3504b88, 0xc00060a6f8})
github.com/openshift/library-go@v0.0.0-20241107160307-0064ad7bd060/pkg/operator/resource/resourceapply/core.go:368 +0x123
github.com/openshift/library-go/pkg/operator/resource/resourceapply.ApplyDirectly({0x352da48, 0xc0007144b0}, 0xc001bb4cc0, {0x353fb20, 0xc0008730a0}, {0x3504b88, 0xc00060a6f8}, 0xc001bb4d40, {0xc001bb4d30, 0x1, ...})
github.com/openshift/library-go@v0.0.0-20241107160307-0064ad7bd060/pkg/operator/resource/resourceapply/generic.go:143 +0xe7e
open-cluster-management.io/ocm/pkg/work/spoke/apply.(*UpdateApply).Apply(0xc001604800, {0x352da48, 0xc0007144b0}, {{0x0, 0x0}, {0xc001221e4a, 0x2}, {0xc0019af180, 0x7}}, 0xc00060af68, ...)
open-cluster-management.io/ocm/pkg/work/spoke/apply/update_apply.go:58 +0x1dc
open-cluster-management.io/ocm/pkg/work/spoke/controllers/manifestcontroller.(*manifestworkReconciler).applyOneManifest(_, {_, _}, _, {{{_, _, _}, {_, _}}}, {{{0xc001c70248, ...}}, ...}, ...)
open-cluster-management.io/ocm/pkg/work/spoke/controllers/manifestcontroller/manifestwork_reconciler.go:206 +0x913
The issue appears to be the same one described in this PR on GitHub
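For context, "concurrent map read and map write" is the Go runtime aborting the process after detecting unsynchronized access to a plain map from more than one goroutine; in the trace above this happens inside SafeToSkipApply. The sketch below illustrates the general pattern for avoiding this class of crash by guarding the map with a sync.RWMutex. It is only an illustration: the safeCache type, its fields, and its methods are hypothetical names, not the library-go resourceCache implementation and not necessarily the fix applied in the referenced PR.

```go
package main

import (
	"fmt"
	"sync"
)

// safeCache is a hypothetical stand-in for a resource cache.
// It is NOT the library-go resourceCache type.
type safeCache struct {
	mu      sync.RWMutex
	entries map[string]string // key -> hash of the last applied resource
}

func newSafeCache() *safeCache {
	return &safeCache{entries: make(map[string]string)}
}

// Update records the hash of the resource most recently applied under key.
// The write lock excludes all concurrent readers and writers.
func (c *safeCache) Update(key, hash string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[key] = hash
}

// SafeToSkip reports whether the cached hash for key matches hash.
// The read lock lets readers run in parallel but blocks while a write is in progress.
func (c *safeCache) SafeToSkip(key, hash string) bool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	cached, ok := c.entries[key]
	return ok && cached == hash
}

func main() {
	cache := newSafeCache()
	var wg sync.WaitGroup

	// Several goroutines read and write the same keys, similar in spirit to
	// multiple reconcilers sharing one cache.
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(worker int) {
			defer wg.Done()
			key := fmt.Sprintf("secret-%d", worker%2)
			for j := 0; j < 1000; j++ {
				cache.Update(key, "hash-a")
				_ = cache.SafeToSkip(key, "hash-a")
			}
		}(i)
	}
	wg.Wait()

	fmt.Println(cache.SafeToSkip("secret-0", "hash-a"))
}
```

Without the mutex, the same loop over a bare map can trigger the fatal error shown in the trace, and running it with `go run -race` would typically report the data race.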
Version-Release number of selected component (if applicable):
Managed cluster: 4.20.4
ACM: 2.13.3
MCE: 2.8.3
How reproducible:
Deploy a spoke cluster
Steps to Reproduce:
- ...
Actual results:
klusterlet agent crashes
Expected results:
klusterlet agent doesn't crash
Additional info:
Attached to the case
- ACM must-gather
- klusterlet logs
- open-cluster-management-agent namespace inspect