- Bug
- Resolution: Unresolved
- Critical
Setup:
- All VMs are volume booted
- Cinder backend configuration (a hedged check of the rendered cinder.conf follows this block):
  tripleo_svf_fc/volume_backend_name:
    value: tripleo_svf_fc
  tripleo_svf_fc/volume_driver:
    value: cinder.volume.drivers.ibm.storwize_svc.storwize_svc_fc.StorwizeSVCFCDriver
  tripleo_svf_fc/storwize_svc_connection_protocol:
    value: FC
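For reference, parameters like these typically end up as a backend section in cinder.conf inside the cinder_volume container on the controller. A minimal sketch of how to confirm that is below; the container name and config path are assumptions based on default TripleO deployments, not taken from this report:
  # On a controller node (container name and path are assumptions):
  sudo podman exec cinder_volume grep -A 3 '^\[tripleo_svf_fc\]' /etc/cinder/cinder.conf
  # Illustrative expected output:
  # [tripleo_svf_fc]
  # volume_backend_name = tripleo_svf_fc
  # volume_driver = cinder.volume.drivers.ibm.storwize_svc.storwize_svc_fc.StorwizeSVCFCDriver
  # storwize_svc_connection_protocol = FC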
- I/O testing was performed in a variety of ways:
  - File copy
  - diskspd.exe
  - winsat disk
  - another Linux tool was also used on RHEL (name not recalled)
- Examples (a hedged fio equivalent for the RHEL guest follows this list):
  - diskspd.exe -c2G -d120 -w50 -o32 -t8 -b1M -Sh -L -si E:\temp\testfile.dat
  - winsat disk -drive c
  - time cp /tmp/40gb.dmp /temp1/.
  - Measure-Command { Copy-Item -Path "C:\SourceFolder\File.txt" -Destination "D:\DestinationFolder" }
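The unnamed Linux tool mentioned above is not identified; if a repeatable Linux-side test is wanted, fio is a common stand-in. The sketch below is only an assumption about a roughly comparable workload to the diskspd run (50/50 read/write mix, 1 MiB blocks, queue depth 32, 8 jobs, 120 s); the target path /temp1/fio-testfile is hypothetical:
  # Hypothetical fio run on the RHEL guest (tool choice and target path are assumptions)
  fio --name=mixed-rw --filename=/temp1/fio-testfile --size=2G \
      --rw=rw --rwmixread=50 --bs=1M --iodepth=32 --numjobs=8 \
      --direct=1 --ioengine=libaio --time_based --runtime=120 --group_reporting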
To Reproduce
Steps to reproduce the behavior:
- Customer migrates VMs from legacy VMware to OSP 17.1.x
- Disk I/O performance tests executed directly on the HW node against the mapped volumes perform as expected and to the customer's satisfaction
- Execute the same performance tests inside VMs running on this compute node and using the mapped volumes (a hedged host-vs-guest comparison sketch follows this list)
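To make the host-vs-guest comparison repeatable, one option is to run an identical, read-only fio job against the multipath device on the compute node and against the attached disk inside the guest. This is only a sketch under assumptions: the multipath WWID, the device names, and the use of fio are not from the report, and the jobs are read-only to avoid destroying data on the volume:
  # On the compute node: identify the multipath device backing the test volume (WWID is hypothetical)
  sudo multipath -ll
  sudo fio --name=host-read --filename=/dev/mapper/36005076... --rw=read --bs=1M \
           --iodepth=32 --numjobs=8 --direct=1 --ioengine=libaio --time_based --runtime=120
  # Inside the guest: repeat the same job against the attached disk (device name assumed)
  sudo fio --name=guest-read --filename=/dev/vdb --rw=read --bs=1M \
           --iodepth=32 --numjobs=8 --direct=1 --ioengine=libaio --time_based --runtime=120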
Expected behavior
- Throughput with OSP 17.1.x should be comparable to the throughput in the legacy VMware environment. Instead, the customer experiences close to a 50% drop in throughput.
Screenshots
- Attached Image
Device Info (please complete the following information):
- OS Version: OSP 17.1.9
Bug impact
- End customers are complaining that their application response times are too slow, to the point of being almost unusable
Known workaround
- None.
Additional context
- The application team ran I/O performance tests to compare BEFORE and AFTER the KVM migration and observed poor read performance (disk I/O); see the attachment.
Considerations:
- The IOzone tool was installed on each server and run locally to capture the metrics per filesystem (a hedged example invocation follows this list).
- Columns B to H are metrics captured BEFORE the migration, on Aug 8th.
- Columns I to O are metrics captured AFTER the migration, on Aug 15th (Solr and Spark servers) and Aug 21st (WAS server).
- Ignore the "z" at the front of each server name (it was added for sorting purposes).
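For context on how such per-filesystem numbers are usually collected, a minimal IOzone invocation is sketched below. The record/file sizes, mount point, and output file name are assumptions for illustration, not the customer's actual command:
  # Hypothetical IOzone run on one filesystem (sizes, path, and output name are assumptions)
  iozone -i 0 -i 1 -r 1m -s 4g -f /temp1/iozone.tmp -R -b /tmp/iozone-results.xls
  # -i 0 / -i 1 : write/rewrite and read/reread tests
  # -r 1m -s 4g : 1 MiB record size, 4 GiB test file
  # -R -b       : Excel-style report written to the given spreadsheet file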
This issue is impacting our application and data refresh performance.
Do you have any idea why READ is underperforming and how it can be improved?
- …