CNV Automation – Week of Feb 18-25, 2026
10K VM Scale Testing
Identified Ceph single-image contention as the root cause of clone deadlocks – all 10K clones target one RBD parent image, and Ceph serializes per-image operations, causing 600+ goroutines to pile up. Applied CSI provisioner throttling to cap concurrent clone calls at 10 (down from 600+). Test #6 is running stably at ~60% complete (~5,957 VMs running, ~460 DVs/hour) – the first successful sustained 10K VM creation run.
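For reference, this kind of throttle is typically applied via the `--worker-threads` flag on the external-provisioner sidecar, which caps concurrent CreateVolume (clone) calls. A minimal sketch, assuming the standard Ceph CSI sidecar layout – the deployment structure and container name shown are illustrative, not the exact change applied:

```yaml
# Illustrative fragment of the RBD provisioner Deployment spec.
# --worker-threads limits how many provisioning (clone) operations
# the sidecar issues in parallel against the backend.
spec:
  template:
    spec:
      containers:
        - name: csi-provisioner
          args:
            - --worker-threads=10   # down from the default of 100
```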
kube-burner Cleanup Fix
Found why kube-burner was destroying all VMs when it hit a timeout – a YAML config flag was blocking on old namespace deletion, and a code bug treated normal transient states as fatal errors. Fixed both and contributed 5 PRs across 2 upstream repos:
- cnv-scenarios PR #23 - Output errors/warnings fix (merged) - CNV-79877
- cnv-scenarios PR #27 - Generic namespace cleanup (merged)
- kube-burner PR #1147 - verifyCondition retry fix (submitted upstream) - CNV-79878
- cnv-scenarios PR #28 - Storage defaults (on fork) - CNV-79907
- cnv-scenarios PR #29 - kube-burner 1.15.0 compat (on fork) - CNV-79908
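The cleanup-blocking behavior above maps to kube-burner's job-level knobs. A hedged sketch of the relevant fields – `cleanup` and `waitWhenFinished` are real kube-burner job options, but the job definition and values here are illustrative and not the exact diff in the PRs:

```yaml
# Illustrative kube-burner job fragment.
jobs:
  - name: cnv-vm-density
    jobIterations: 10000
    cleanup: false          # don't block the run on deleting leftover namespaces
    waitWhenFinished: true  # wait for created objects to become ready
```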
HealthCrew Dashboard (CNV-79980)
- *CNV scenarios report system* – Generates visual HTML reports for kube-burner test runs so results are easy to read without digging through logs.
- *Test progression engine* – Automatically detects problems during a run (24 stop conditions) and suggests the next test configuration to try (~30 rules), replacing manual trial and error.
- *User templates* – Engineers can save, load, and share test configurations so they don't have to reconfigure 30+ parameters every time.
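The progression-engine idea – stop conditions that inspect run metrics plus rules that propose the next configuration – can be sketched as below. This is a hypothetical illustration, not the dashboard's actual code; all names (`RunMetrics`, `clone_backlog`, `suggest_next`) and thresholds are invented for the example:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class RunMetrics:
    dv_rate_per_hour: float  # DataVolume creation rate observed in the run
    failed_vms: int
    pending_clones: int      # clones queued behind the RBD parent image

# A stop condition returns a reason string when the run should halt.
StopCondition = Callable[[RunMetrics], Optional[str]]

def clone_backlog(m: RunMetrics) -> Optional[str]:
    """Stop when too many clones queue up behind the parent image."""
    return "clone backlog exceeds 100" if m.pending_clones > 100 else None

def suggest_next(m: RunMetrics, current_qps: int) -> int:
    """Example rule: halve QPS when a backlog forms, otherwise keep it."""
    return max(1, current_qps // 2) if m.pending_clones > 100 else current_qps

def evaluate(m: RunMetrics, conditions: list[StopCondition]) -> list[str]:
    """Collect the reasons from every triggered stop condition."""
    return [msg for c in conditions if (msg := c(m))]
```

In practice the real engine chains ~24 such conditions and ~30 suggestion rules; the value is that a failed run produces a concrete next configuration instead of manual trial and error.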
Storage Comparison Report
Built a side-by-side comparison of 5 storage backends (NVMe LSO, Ceph RBD, CephFS, IBM GPFS, NetApp GPFS) – shows which storage performs best for CNV workloads across sequential reads, writes, and random IOPS. Sourced from CNV-75751, Regression 4.21, and Fusion Access GA test results.
DFBUGS-394 Verification
Wrote the step-by-step procedure to verify that ODF storage pods now have higher priority so they won't be evicted during cluster stress – ready for execution on ODF 4.21.