-
Bug
-
Resolution: Unresolved
-
Normal
-
None
-
4.16, 4.16.z
-
Quality / Stability / Reliability
-
False
-
-
None
-
Moderate
Description of problem:
The kubelet crashes on nodes across multiple clusters with `fatal error: concurrent map iteration and map write`. The crash has been traced to the FindPluginBySpec() function in VolumePluginMgr and affects FlexVolume operations, particularly with the ibmc-s3fs plugin. Because FUSE processes are not cleaned up after a crash, affected nodes are left with stale PV mounts and applications are disrupted.
The observed behavior matches upstream Kubernetes issue #124839 and its fix, PR #129755. Logs from affected nodes confirm the fatal error message, and the crashes align with known FlexVolume activity during DaemonSet rollouts.
~~~
Jun 30 16:38:04 popular-reptile-x-large-wdc-containers-nonprod1 kubenswrapper[6186]: fatal error: concurrent map iteration and map write
Jun 30 16:38:04 popular-reptile-x-large-wdc-containers-nonprod1 kubenswrapper[6186]: goroutine 280507699 [running]:
Jun 30 16:38:04 popular-reptile-x-large-wdc-containers-nonprod1 kubenswrapper[6186]: k8s.io/kubernetes/pkg/volume.(*VolumePluginMgr).FindPluginBySpec(0xc0003c4d08, 0xc0079b04e0)
Jun 30 16:38:04 popular-reptile-x-large-wdc-containers-nonprod1 kubenswrapper[6186]: k8s.io/kubernetes/pkg/volume/plugins.go:683 +0x327
~~~
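For readers unfamiliar with this class of failure: the Go runtime aborts the whole process with an unrecoverable fatal error when it detects one goroutine iterating a map while another writes to it. The sketch below is illustrative only and is not the upstream patch or the kubelet code. `pluginRegistry`, `register`, and `countMatching` are hypothetical names, and guarding the map with a `sync.RWMutex` is assumed here simply as one common way to serialize iteration against writes.

```go
package main

import (
	"fmt"
	"sync"
)

// pluginRegistry is a hypothetical stand-in for a manager that keeps
// plugins in a map shared between goroutines. It is NOT VolumePluginMgr.
type pluginRegistry struct {
	mu      sync.RWMutex
	plugins map[string]string // plugin name -> driver path (illustrative)
}

func newPluginRegistry() *pluginRegistry {
	return &pluginRegistry{plugins: make(map[string]string)}
}

// register adds or replaces an entry while holding the write lock,
// so no reader can be iterating the map at the same time.
func (r *pluginRegistry) register(name, path string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.plugins[name] = path
}

// countMatching iterates the map under the read lock. Without the lock,
// a concurrent register call could trigger
// "fatal error: concurrent map iteration and map write".
func (r *pluginRegistry) countMatching(match func(name string) bool) int {
	r.mu.RLock()
	defer r.mu.RUnlock()
	n := 0
	for name := range r.plugins {
		if match(name) {
			n++
		}
	}
	return n
}

func main() {
	reg := newPluginRegistry()
	var wg sync.WaitGroup

	// Writers and readers run concurrently, loosely mimicking plugin
	// probing racing with plugin lookups.
	for i := 0; i < 8; i++ {
		wg.Add(2)
		go func(i int) {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				reg.register(fmt.Sprintf("plugin-%d-%d", i, j), "/illustrative/path")
			}
		}(i)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				reg.countMatching(func(string) bool { return true })
			}
		}()
	}
	wg.Wait()

	fmt.Println("registered plugins:", reg.countMatching(func(string) bool { return true }))
}
```

Removing the lock calls from this sketch and running it with `go run -race` should report the data race, and without the race detector the process may abort with the same fatal error seen in the log above.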
This bug was created following the Slack discussion [1].
The customer is seeking clarity on whether the upstream fix will be backported to any supported OpenShift 4.x version.
[1] https://redhat-internal.slack.com/archives/CK1AE4ZCK/p1753114688440409
Actual results:
The kubelet crashes with `fatal error: concurrent map iteration and map write` in FindPluginBySpec(), leaving stale PV mounts and FUSE processes that are not cleaned up on the affected nodes.
Expected results:
Additional info: