Feature Request
Resolution: Unresolved
Normal
Product / Portfolio Work
1. Proposed title of this feature request
Make Kubernetes API client QPS and Burst configurable for KubeVirt CSI infra cluster client to resolve volume attach throttling
2. What is the nature and description of the request?
Currently, the KubeVirt CSI driver’s controller uses a Kubernetes client to interact with the infra (management) cluster for hotplug operations (handling DataVolumes, VirtualMachineInstances, etc.). This client is built from a rest.Config that leaves QPS and Burst unset, so it falls back to the client-go defaults of 5 QPS and a burst of 10.
In high-concurrency scenarios (e.g., 10+ concurrent PVC attachments), the driver issues multiple API calls per request. This quickly exhausts the default token bucket, causing:
Client-side throttling: Logs show client-go messages of the form "Waited for Xms due to client-side throttling".
gRPC Timeouts: Requests block until they exceed the csi-attacher context deadline (typically 120s), returning a failure to the sidecar.
Exponential Delays: The csi-attacher retries failed requests with exponential backoff, so total "Pod-to-Ready" times reach 4–6+ minutes, even when the underlying infrastructure is idle.
Requested Change:
Expose Configuration: Add command-line flags (e.g., --infra-kube-api-qps and --infra-kube-api-burst) or environment variables to the CSI controller.
Raise Default Limits: Increase the effective defaults for the infra-cluster client, which today are inherited from client-go (suggested: 20 QPS / 50 Burst), to align with modern CSI driver standards (like AWS EBS or GCP PD).
3. Why does the customer need this? (List the business requirements here)
Reduced Provisioning Latency: In HyperShift (HCP) environments, customers expect rapid scaling. A 5-minute wait for a volume to attach is a significant regression in UX and application availability.
Scalability for Stateful Workloads: Modern CI/CD pipelines and database clusters often trigger "burst" schedules where many pods start simultaneously. The CSI driver must be able to burst its API communication to match this load.
Operational Observability & Tuning: Platform engineers need the ability to tune the driver’s "aggressiveness" based on the size of their management cluster. Without these flags, the only way to fix throttling is to patch and rebuild the driver.
Ecosystem Parity: Providing these flags aligns KubeVirt CSI with the operational patterns of other major CSI drivers, making it "production-ready" for enterprise scale.
4. List any affected packages or components.
KubeVirt CSI driver