Red Hat OpenStack Services on OpenShift / OSPRH-20737

Applications become very slow after migrating to KVM

    • Type: Bug
    • Resolution: Unresolved
    • Component: horizon-operator
    • Severity: Critical

      Setup:

      1. All VMs are volume-booted.
      2. Cinder backend:
        tripleo_svf_fc/volume_backend_name:
          value: tripleo_svf_fc
        tripleo_svf_fc/volume_driver:
          value: cinder.volume.drivers.ibm.storwize_svc.storwize_svc_fc.StorwizeSVCFCDriver
        tripleo_svf_fc/storwize_svc_connection_protocol:
          value: FC

      3. I/O testing was performed in a variety of ways:

        • File copy
        • diskspd.exe
        • winsat disk
        • Another Linux tool was also used in RHEL (the reporter does not know its name; an illustrative fio sketch follows this list)
        • Example commands:
          diskspd.exe -c2G -d120 -w50 -o32 -t8 -b1M -Sh -L -si E:\temp\testfile.dat
          winsat disk -drive c
          time cp /tmp/40gb.dmp /temp1/.
          Measure-Command { Copy-Item -Path "C:\SourceFolder\File.txt" -Destination "D:\DestinationFolder" }
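
      For the RHEL guests, a roughly equivalent run can be expressed with fio. This is only an illustrative sketch: fio is not confirmed as the tool the customer used, and every parameter below is an assumption chosen to mirror the diskspd flags above (2 GiB file, 120 s, 50/50 read/write, 1 MiB blocks, queue depth 32, 8 workers).

        # Illustrative only; the file path and all values are assumptions
        fio --name=kvm-io-test --filename=/temp1/fio-testfile.dat \
            --size=2G --runtime=120 --time_based \
            --rw=rw --rwmixread=50 --bs=1M \
            --iodepth=32 --numjobs=8 \
            --direct=1 --ioengine=libaio --group_reporting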

      Steps to reproduce the behavior:

      1. Customer migrates VMs from the legacy VMware environment to OSP 17.1.x.
      2. Disk I/O performance tests executed directly on the HW node against the mapped volumes perform as expected, to the customer's satisfaction.
      3. Execute the same performance tests from VMs running on this compute node and using the mapped volumes.
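
      A minimal way to express the bare-metal versus in-guest comparison is sketched below; the device paths are assumptions (the actual multipath and guest device names are not recorded in the ticket):

        # On the compute node, read directly from the mapped multipath device (hypothetical path),
        # roughly 40 GiB to mirror the 40gb.dmp copy test
        dd if=/dev/mapper/mpatha of=/dev/null bs=1M count=40960 iflag=direct
        # Inside the guest, read from the attached volume (hypothetical path)
        dd if=/dev/vdb of=/dev/null bs=1M count=40960 iflag=direct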

      Expected behavior

      • Throughput with OSP 17.1.x should be comparable to the throughput seen in the legacy VMware environment. Instead, the customer experiences close to a 50% drop in throughput.

      Screenshots

      • Attached Image

      Device Info:

      • Hardware Specs: not provided
      • OS Version: OSP 17.1.9

      Bug impact

      • End customers are complaining that their application response times are too slow and the applications are almost unusable.

      Known workaround

      • None.

      Additional context

      • The application team ran I/O performance tests to compare BEFORE and AFTER the KVM migration and observed poor read performance (disk I/O); see the attached spreadsheet.
        Considerations:
        The IOzone tool was installed on each server and run locally to capture metrics per filesystem (an illustrative invocation is sketched below).
        Columns B to H are metrics captured BEFORE the migration, on Aug 8th.
        Columns I to O are metrics captured AFTER the migration, on Aug 15th (Solr/Spark servers) and Aug 21st (WAS server).
        Ignore the leading "z" on each server name; it was added for sorting purposes.
        This issue is impacting our application and data refresh performance.
        Do you have any idea why READ is underperforming and how it can be improved?
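
        The exact IOzone invocation is not recorded in the ticket; a typical per-filesystem run that would produce comparable read/write figures might look like the following (every flag and the file path are assumptions, not the application team's actual command):

          # Assumed IOzone run: -e include flush in timings, -I use O_DIRECT,
          # -i 0 write/rewrite test, -i 1 read/reread test,
          # -s 4g file size, -r 1m record size, -f target file on the filesystem under test
          iozone -e -I -i 0 -i 1 -s 4g -r 1m -f /temp1/iozone.tmp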

        1. io_perf_tests_20250815_test_env.xlsx
          248 kB
          Raj Varadarajan
        2. Drawing1.vsdx
          73 kB
          Raj Varadarajan
        3. baremetal_18s_40g.png
          76 kB
          Raj Varadarajan
        4. VM_100s_40g.png
          406 kB
          Raj Varadarajan

              Assignee: Unassigned
              Reporter: Raj Varadarajan (rhn-support-rvaradar)
              rhos-dfg-ui