Type: Feature Request
Resolution: Unresolved
Priority: Normal
Fix Version/s: None
Affects Version/s: None
Component/s: Hosted Control Planes
Labels:
- cee.neXT

Target Version:
None
Activity Type:
Product / Portfolio Work
Status Summary:
None
Blocked:
False
Blocked Reason:
None
Products:
None
Hierarchy Progress Bar:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Review Complete:
None
PX Impact Score:
PX Impact Range:
None
PX Priority Data:
None
PX Technical Impact:
None
PX Technical Impact Notes:
None
PX Scheduling Request:
None

1. Proposed title of this feature request

Make Kubernetes API client QPS and Burst configurable for KubeVirt CSI infra cluster client to resolve volume attach throttling

2. What is the nature and description of the request?

Currently, the KubeVirt CSI driver’s controller uses a Kubernetes client to interact with the infra (management) cluster for hotplug operations (handling DataVolumes, VirtualMachineInstances, etc.). This client is initialized using rest.Config without explicit overrides, falling back to the client-go defaults of 5 QPS and 10 Burst.

In high-concurrency scenarios (e.g., 10+ concurrent PVC attachments), the driver issues multiple API calls per attach request. Each call consumes a token from the client's rate limiter, so the default token bucket is quickly exhausted, causing:

Client-side throttling: Controller logs show messages such as "Waited for Xms due to client-side throttling".

gRPC Timeouts: Requests block until they exceed the csi-attacher context deadline (typically 120s), returning a failure to the sidecar.

Exponential Delays: Retries by the csi-attacher result in total "Pod-to-Ready" times of 4–6+ minutes, even when the underlying infrastructure is idle.
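
The throttling effect can be estimated with simple token-bucket arithmetic. The sketch below is illustrative only: it assumes every API call is queued at the same instant and that each attachment costs 30 API calls, a hypothetical figure that has not been measured against the driver. The function name waitForNth is likewise made up for this example.

```go
package main

import (
	"fmt"
	"time"
)

// waitForNth returns how long the n-th request (1-based) waits in a token
// bucket that starts full with `burst` tokens and refills at `qps` tokens per
// second, assuming every request is queued at time zero.
func waitForNth(n int, qps float64, burst int) time.Duration {
	if n <= burst {
		return 0 // served from the initial burst allowance
	}
	seconds := float64(n-burst) / qps
	return time.Duration(seconds * float64(time.Second))
}

func main() {
	// Hypothetical workload: 10 concurrent attachments, 30 API calls each.
	const totalCalls = 10 * 30

	fmt.Println("client-go defaults (5 QPS / 10 burst):", waitForNth(totalCalls, 5, 10))
	fmt.Println("suggested limits (20 QPS / 50 burst):", waitForNth(totalCalls, 20, 50))
}
```

Under these assumptions the last call waits 58s with the client-go defaults and 12.5s with the suggested limits. The model ignores retries and polling, which add further calls to the same bucket.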

Requested Change:

Expose Configuration: Add command-line flags (e.g., --infra-kube-api-qps and --infra-kube-api-burst) or environment variables to the CSI controller.

Raise Default Limits: Set higher defaults for the infra-cluster client instead of inheriting the client-go fallback values (suggested: 20 QPS / 50 Burst) to align with modern CSI driver standards (like AWS EBS or GCP PD).
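
A minimal sketch of how the requested flags might be parsed, using only the Go standard library. The flag names and the 20/50 defaults are the ones proposed in this request and do not exist in the driver today. The clientLimits type, the parseLimits function, and the flag-set name are hypothetical stand-ins. In an actual controller, the parsed values would presumably be assigned to the QPS and Burst fields of the infra cluster's rest.Config before the clientset is created.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// clientLimits is a hypothetical stand-in for the two rate-limit fields of
// client-go's rest.Config (QPS float32, Burst int).
type clientLimits struct {
	QPS   float32
	Burst int
}

// parseLimits reads the proposed flags from args. The flag names and the
// 20/50 defaults come from this request; they are not existing driver flags.
func parseLimits(args []string) (clientLimits, error) {
	fs := flag.NewFlagSet("kubevirt-csi-controller", flag.ContinueOnError)
	qps := fs.Float64("infra-kube-api-qps", 20, "QPS for the infra cluster API client")
	burst := fs.Int("infra-kube-api-burst", 50, "burst for the infra cluster API client")
	if err := fs.Parse(args); err != nil {
		return clientLimits{}, err
	}
	// Sketch-level validation: require positive values for both settings.
	if *qps <= 0 || *burst <= 0 {
		return clientLimits{}, fmt.Errorf("qps and burst must be positive, got %v and %d", *qps, *burst)
	}
	return clientLimits{QPS: float32(*qps), Burst: *burst}, nil
}

func main() {
	limits, err := parseLimits(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	// In a real controller these values would be copied onto the infra
	// cluster's rest.Config before the clientset is built.
	fmt.Printf("infra client limits: qps=%v burst=%d\n", limits.QPS, limits.Burst)
}
```

Rejecting non-positive values is a choice made for this sketch. The driver could instead pass such values through and rely on client-go's own handling of them.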

3. Why does the customer need this? (List the business requirements here)

Reduced Provisioning Latency: In HyperShift (HCP) environments, customers expect rapid scaling. A 5-minute wait for a volume to attach is a significant regression in UX and application availability.

Scalability for Stateful Workloads: Modern CI/CD pipelines and database clusters often trigger "burst" schedules where many pods start simultaneously. The CSI driver must be able to burst its API communication to match this load.

Operational Observability & Tuning: Platform engineers need the ability to tune the driver’s "aggressiveness" based on the size of their management cluster. Without these flags, the only way to fix throttling is by patching the driver binary.

Ecosystem Parity: Providing these flags aligns KubeVirt CSI with the operational patterns of other major CSI drivers, making it "production-ready" for enterprise scale.

4. List any affected packages or components.

KubeVirt CSI driver

Assignee: Adam Litke

Reporter: Divyam Pateriya

Need Info From: None

Votes: 0

Watchers: 2

Created: 2026/02/16 1:41 PM

Updated: 2026/02/17 9:40 PM

Target start: None

Target end: None
