Type: Feature Request
Resolution: Unresolved
Priority: Normal
Fix Version/s: None
Affects Version/s: None
Component/s: Hosted Control Planes
Labels:
- cee.next_proposed
- kubevirt

Target Version:
None
Activity Type:
Product / Portfolio Work
Status Summary:
None
Blocked:
False
Blocked Reason:

None

Products:
None
Hierarchy Progress Bar:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Review Complete:
None
PX Impact Score:
PX Impact Range:
None
PX Priority Data:
None
PX Technical Impact:
None
PX Technical Impact Notes:
None
PX Scheduling Request:
None

1. Proposed title of this feature request

Expose hotplug operation timing (latency/duration) in KubeVirt CSI driver logs and metrics to diagnose volume attach delays.

2. What is the nature and description of the request?

Nature: Observability Enhancement (Logging and Prometheus Metrics).

Description: The KubeVirt CSI driver currently orchestrates the hotplug flow during ControllerPublishVolume by attaching a DataVolume to a VirtualMachineInstance (VMI) and polling until the volume is "Ready." No structured timing data is exposed during this sequence, so operators cannot tell whether the time was spent on API rate limiting, KubeVirt control plane reconciliation, or backend storage staging.

Requested Change:

Structured Observability Logs: Inject timestamped log lines at critical handoff points in pkg/service/controller.go or equivalent:

- Hotplug Start: When the patch/update to the VMI spec is initiated.
- Wait Initiation: When the driver enters the polling loop for volumeStatus.
- Hotplug Success: When the volume phase reaches Ready in the VMI status.
- Timeout/Failure: Explicitly log the elapsed duration if the 120s (or configured) deadline is hit.
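
For illustration only, the four log points above could look roughly like the following Go sketch. It uses only the standard library (log/slog, Go 1.21+). The names timedHotplug, attachFunc, readyFunc and simulate are hypothetical stand-ins, not functions from the actual driver; the real VMI patch and volumeStatus check would replace the fake callbacks.

```go
package main

import (
	"context"
	"fmt"
	"log/slog"
	"os"
	"time"
)

// attachFunc and readyFunc are hypothetical stand-ins for the driver's VMI
// patch call and its volumeStatus check; they are not real driver APIs.
type attachFunc func(ctx context.Context) error
type readyFunc func(ctx context.Context) (bool, error)

// timedHotplug runs attach, then polls ready until it reports true or the
// timeout expires, logging the elapsed time at each handoff point.
func timedHotplug(ctx context.Context, log *slog.Logger, volume string,
	timeout, interval time.Duration, attach attachFunc, ready readyFunc) (time.Duration, error) {
	start := time.Now()
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	// Hotplug Start: the VMI spec update is about to be sent.
	log.Info("hotplug start", "volume", volume)
	if err := attach(ctx); err != nil {
		log.Error("hotplug failure", "volume", volume, "elapsed", time.Since(start), "err", err)
		return time.Since(start), err
	}

	// Wait Initiation: entering the polling loop.
	log.Info("hotplug wait initiated", "volume", volume, "api_update_elapsed", time.Since(start))
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		ok, err := ready(ctx)
		if err != nil {
			log.Error("hotplug failure", "volume", volume, "elapsed", time.Since(start), "err", err)
			return time.Since(start), err
		}
		if ok {
			// Hotplug Success: the volume reported Ready.
			log.Info("hotplug success", "volume", volume, "elapsed", time.Since(start))
			return time.Since(start), nil
		}
		select {
		case <-ctx.Done():
			// Timeout/Failure: the deadline was hit, so log how long we waited.
			log.Error("hotplug timeout", "volume", volume, "elapsed", time.Since(start), "timeout", timeout)
			return time.Since(start), ctx.Err()
		case <-ticker.C:
		}
	}
}

// simulate drives timedHotplug with fake callbacks: the volume becomes ready
// on poll number readyAfterPolls, and the deadline is timeoutMs milliseconds.
func simulate(readyAfterPolls, timeoutMs int) (time.Duration, error) {
	log := slog.New(slog.NewTextHandler(os.Stderr, nil))
	polls := 0
	attach := func(context.Context) error { return nil }
	ready := func(context.Context) (bool, error) {
		polls++
		return polls >= readyAfterPolls, nil
	}
	return timedHotplug(context.Background(), log, "pvc-example",
		time.Duration(timeoutMs)*time.Millisecond, 5*time.Millisecond, attach, ready)
}

func main() {
	elapsed, err := simulate(3, 1000)
	fmt.Println("elapsed:", elapsed, "err:", err)
}
```

Key/value pairs are used instead of formatted strings so that the elapsed values can be filtered and aggregated by log tooling. The field names shown here are suggestions, not an agreed schema.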

Prometheus: Introduce a new histogram metric:

kubevirt_csi_hotplug_duration_seconds: Labeled by storage_class and phase (Total, API_Update, VMI_Wait).
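
As a rough sketch of how the three phase values could be derived, the Go example below splits one hotplug operation into API_Update, VMI_Wait and Total observations. It is standard library only; hotplugTimer, observeFunc and replay are hypothetical names. In a real driver the observe callback would presumably wrap a Prometheus client_golang HistogramVec, for example hist.WithLabelValues(storageClass, phase).Observe(seconds). Note that the client_golang default buckets top out at 10 seconds, so custom buckets covering the 120s deadline and multi-minute delays would likely be needed.

```go
package main

import (
	"fmt"
	"time"
)

// Phase label values taken from the request; the constant names are illustrative.
const (
	PhaseTotal     = "Total"
	PhaseAPIUpdate = "API_Update"
	PhaseVMIWait   = "VMI_Wait"
)

// observeFunc receives one histogram observation. It is a hypothetical seam
// where a Prometheus histogram (or a test double) can be plugged in.
type observeFunc func(storageClass, phase string, seconds float64)

// hotplugTimer tracks the phase boundaries of a single hotplug operation.
type hotplugTimer struct {
	storageClass string
	observe      observeFunc
	start        time.Time
	apiDone      time.Time
}

// newHotplugTimer starts timing at now, just before the VMI update is sent.
func newHotplugTimer(storageClass string, observe observeFunc, now time.Time) *hotplugTimer {
	return &hotplugTimer{storageClass: storageClass, observe: observe, start: now}
}

// apiUpdateDone marks the end of the VMI update and records the API_Update phase.
func (t *hotplugTimer) apiUpdateDone(now time.Time) {
	t.apiDone = now
	t.observe(t.storageClass, PhaseAPIUpdate, now.Sub(t.start).Seconds())
}

// finish records Total, plus VMI_Wait when the API update phase completed.
func (t *hotplugTimer) finish(now time.Time) {
	if !t.apiDone.IsZero() {
		t.observe(t.storageClass, PhaseVMIWait, now.Sub(t.apiDone).Seconds())
	}
	t.observe(t.storageClass, PhaseTotal, now.Sub(t.start).Seconds())
}

// replay feeds fixed phase lengths (in whole seconds) through a timer and
// returns the recorded observations keyed by phase.
func replay(apiSeconds, waitSeconds int) map[string]float64 {
	got := map[string]float64{}
	observe := func(_, phase string, seconds float64) { got[phase] = seconds }
	t0 := time.Unix(0, 0)
	timer := newHotplugTimer("example-sc", observe, t0)
	apiDone := t0.Add(time.Duration(apiSeconds) * time.Second)
	timer.apiUpdateDone(apiDone)
	timer.finish(apiDone.Add(time.Duration(waitSeconds) * time.Second))
	return got
}

func main() {
	// A 6-minute attach where only 2 seconds were spent on the API update.
	fmt.Println(replay(2, 358))
}
```

Passing timestamps in explicitly keeps the phase arithmetic testable without sleeping. Recording Total separately, rather than summing the phases at query time, keeps it available even when the API update fails and VMI_Wait is never observed.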

3. Why does the customer need this? (List the business requirements here)

Mean Time to Detection (MTTD): In environments like HyperShift (HCP), an attachment delay of 6+ minutes is currently a "black box." Operators cannot see whether the delay comes from the CSI driver being throttled (API level) or from virt-handler being slow to plug the device (node level).

Support Efficiency: Provides a clear "paper trail" in logs to resolve disputes between Storage Vendors (backend latency) and Platform Teams (KubeVirt latency).

4. List any affected packages or components.

HCP KubeVirt CSI Driver

Assignee: Peter Lauterbach

Reporter: Divyam Pateriya

Need Info From: None

Votes: 0

Watchers: 1

Created: 2026/02/18 4:34 PM

Updated: 2026/02/18 9:51 PM

Target start: None

Target end: None
