Type: Bug
Resolution: Unresolved
Priority: Major
Fix Version/s: 4.22.0
Affects Version/s: 4.18.z
Component/s: Storage
Labels:
- vsphere

Activity Type:
None
Blocked:
False
Blocked Reason:
None
Story Points:
None
Severity:
None
Regression:
None

Target Backport Versions:
4.18.z
Target Version:
4.22.0
Release Blocker:
None
Sprint:
Storage Sprint 283
sprint_count:
1

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
In Progress
Release Note Type:
Release Note Not Required
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

The vsphere-problem-detector-operator pod crashed and restarted with the Go runtime error "fatal error: concurrent map writes". This appears to be a race condition in the SetCbtData function, which is called concurrently when multiple nodes are scanned for Changed Block Tracking (CBT) status at the same time.

Version-Release number of selected component (if applicable):

    4.18.28

Relevant Output:

oc get pods vsphere-problem-detector-operator-5cdfbbd7d8-gmwdw
NAME                                                 READY   STATUS    RESTARTS   AGE
vsphere-problem-detector-operator-5cdfbbd7d8-gmwdw   1/1     Running   1          28d



oc logs vsphere-problem-detector-operator-5cdfbbd7d8-gmwdw -p
 
2026-01-02T07:54:52.698943047Z I0102 07:54:52.698882       1 node_cbt.go:52] Property not found for node cb-w1.cbdr.iiabank.com.jo

2026-01-02T07:54:52.698943047Z fatal error: concurrent map writes
2026-01-02T07:54:52.698943047Z I0102 07:54:52.698889       1 vsphere_check.go:321] CollectNodeCBT:cb-w3.cbdr.iiabank.com.jo passed
2026-01-02T07:54:52.702159386Z
2026-01-02T07:54:52.702159386Z goroutine 1610214 [running]:
2026-01-02T07:54:52.702206787Z github.com/openshift/vsphere-problem-detector/pkg/util.(*ClusterInfo).SetCbtData(0xc000f808e0?, {0x2e99d0e?, 0x1?})
2026-01-02T07:54:52.702206787Z  github.com/openshift/vsphere-problem-detector/pkg/util/cluster_info.go:151 +0xa5
2026-01-02T07:54:52.702206787Z github.com/openshift/vsphere-problem-detector/pkg/check.(*CollectNodeCBT).CheckNode(0xc00127f040?, 0xc0014ae780, 0xc000d5f508, 0xc000c42008)
2026-01-02T07:54:52.702206787Z  github.com/openshift/vsphere-problem-detector/pkg/check/node_cbt.go:53 +0x3d8
2026-01-02T07:54:52.702206787Z github.com/openshift/vsphere-problem-detector/pkg/operator.runSingleNodeSingleCheck(0xc0014ae780, 0xc0016f1840, 0xc000d5f508, 0xc000c42008, {0x336dc50, 0x4daf080})

Actual results:

    The Go runtime detects unsafe concurrent map access and terminates the process, leading to a pod restart.

Expected results:

    The pod should not crash because of unsynchronized concurrent map access.
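
For illustration only, here is a minimal Go sketch of one common way to avoid this class of crash: guarding the shared map with a sync.Mutex. The type cbtCounts, its fields, and the helper scanNodes are hypothetical stand-ins; the real ClusterInfo struct and the actual fix in the linked pull request may look different.

```go
package main

import (
	"fmt"
	"sync"
)

// cbtCounts is a hypothetical stand-in for the state that SetCbtData updates.
// The field names and the map's key/value types are assumptions, not the
// operator's real definitions.
type cbtCounts struct {
	mu     sync.Mutex
	counts map[string]int
}

func newCbtCounts() *cbtCounts {
	return &cbtCounts{counts: make(map[string]int)}
}

// Set increments the counter for a CBT state while holding the mutex, so
// concurrent callers never write to the map at the same time.
func (c *cbtCounts) Set(state string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[state]++
}

// Get reads a counter under the same lock.
func (c *cbtCounts) Get(state string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[state]
}

// scanNodes simulates one goroutine per node, each reporting a CBT state.
func scanNodes(c *cbtCounts, nodes int) {
	var wg sync.WaitGroup
	for i := 0; i < nodes; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			state := "enabled"
			if i%2 == 1 {
				state = "disabled"
			}
			c.Set(state)
		}(i)
	}
	wg.Wait()
}

func main() {
	c := newCbtCounts()
	scanNodes(c, 100)
	fmt.Println(c.Get("enabled"), c.Get("disabled"))
}
```

Without the Lock/Unlock pair in Set, the goroutines would write to the map concurrently, which the Go runtime may detect and abort with the same fatal error shown in the log above. Running tests with the race detector (go test -race) is a standard way to catch this kind of access before it reaches production.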

Additional info:

    The must-gather will be uploaded and the details shared soon.

blocks

OCPBUGS-74706 [4.21] vsphere-problem-detector-operator crash due to concurrent map writes

Closed

is cloned by

OCPBUGS-74706 [4.21] vsphere-problem-detector-operator crash due to concurrent map writes

Closed

links to

openshift/vsphere-problem-detector#206: OCPBUGS-70365: fix concurrent map writes

Assignee: Richard Hrmo

Reporter: Divyam Pateriya

Need Info From: None

Contributors: None

QA Contact: Rahul Deore

Doc Contact: None

Votes: 0

Watchers: 8

Created: 2026/01/06 2:27 PM

Updated: 2026/02/02 9:10 AM
