  1. Performance and Scale for AI Platforms
  2. PSAP-652

Run Phoronix tests on OpenShift

    • Phoronix test suite on OpenShift
    • 0% To Do, 0% In Progress, 100% Done

      Epic Goal

      • The goal of this epic is to "lower the barrier of entry" for running the Phoronix test suites on OpenShift.

      Why is this important?

      • Phoronix offers a wide variety of tests (450+ test profiles, 100+ test suites, extensibility) and can record and archive test results. To me, that makes it another great tool in our arsenal for collecting a lot of interesting performance data to reason about our systems, or in this case an OpenShift cluster under test.

      Scenarios

      1. Heterogeneous clusters - With OpenShift adopting the heterogeneous cluster theme for CY22 (each node in the cluster can be of a different architecture), there seems to be value in being able to assess the broad performance capabilities of each node independently and potentially use that information for better workload scheduling decisions, either manually or through the scheduler.
      2. Autotuning - One of the goals of the node-level autotuning initiative is to come up with better-performing tuned profiles for specific scenarios. One approach we were considering was to take a very specific microbenchmark and use it in the autotuning framework (optimizing some objective function for that microbenchmark) to arrive at optimized kernel-level tunables. Another idea is to optimize the kernel-level tunables for a specific Phoronix test suite (or several of them) instead of a single microbenchmark, which might lead to tuned profiles that target a broader set of workloads.
      3. AI/ML/HPC - Phoronix seems to have a number of AI/ML/HPC tests that can give us good insights into single-node CPU/GPU performance for these workloads.
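
The autotuning idea in scenario 2 can be sketched as a search over kernel tunables that maximizes a benchmark score. The snippet below is only an illustration under stated assumptions: the candidate values are made up, and `run_benchmark` is a hypothetical, synthetic stand-in for applying the tunables to a node and running a Phoronix test suite. It does not reflect the actual autotuning framework.

```python
import itertools

# Hypothetical search space. The sysctl names exist in Linux, but the
# candidate values here are invented purely for illustration.
SEARCH_SPACE = {
    "vm.swappiness": [10, 30, 60],
    "vm.dirty_ratio": [10, 20, 40],
}


def run_benchmark(tunables):
    """Synthetic stand-in for the objective function (assumption).

    A real version would apply the tunables to a node, run a Phoronix
    test suite there, and return its score. This fake score simply
    peaks at vm.swappiness=30 and vm.dirty_ratio=20.
    """
    return (
        100.0
        - abs(tunables["vm.swappiness"] - 30)
        - 0.5 * abs(tunables["vm.dirty_ratio"] - 20)
    )


def grid_search(space, objective):
    """Evaluate every combination of tunables and keep the best one."""
    names = sorted(space)
    best_cfg, best_score = None, float("-inf")
    for values in itertools.product(*(space[name] for name in names)):
        cfg = dict(zip(names, values))
        score = objective(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


best_cfg, best_score = grid_search(SEARCH_SPACE, run_benchmark)
print(best_cfg, best_score)
```

Exhaustive grid search is used here only because it is the simplest strategy to read; a real framework would likely use a smarter optimizer, since each evaluation means a full benchmark run.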

      Acceptance Criteria

      • Demo of Phoronix running on OpenShift
      • A blog post describing the effort
      • Open source repo storing the final deliverables
      • Delivery of a Phoronix container image on quay.io

      Dependencies (internal and external)

      1. https://www.phoronix-test-suite.com/

      Previous Work (Optional):

      1. https://off-by-one.dev/benchmarking-on-kubernetes/

              jmencak Jiri Mencak
              akamra8979 Ashish Kamra