  1. Performance and Scale for AI Platforms
  2. PSAP-491

Update openshift-control-plane profile for RHEL9 based RHCOS

Details

    • openshift-control-plane profiles review
    • False
    • False
    • To Do
    • 100
    • 100%
    • Undefined

    Description

      OCP/Telco Definition of Done
      Epic Template descriptions and documentation.

      Epic Goal

      • RHEL QE has tested without the `sched_` tunables, on RHEL 9 only, and found no performance impact.
      • Determine a set of relevant tunables which impact control-plane performance, and their ranges.
      • Determine a set of workloads for testing control plane performance.
      • Use the kruize/autotune project to automate the optimization of the set of relevant tunables for control plane performance.
      • Conduct large-scale testing, ideally in collaboration with the Perf&Scale team, with and without the `sched_` tunables present in the openshift-control-plane profile; compare the results and adjust the profile for future releases accordingly.
      • Provide updated documentation in the OpenShift docs on the profiles used.
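
      The with/without comparison described above could be scripted along the following lines. This is only a minimal sketch: the metric (API-server p99 latency), the sample values, the 5% threshold, and the helper names `relative_change` and `summarize` are all assumptions made for illustration, not part of any existing Perf&Scale or kruize/autotune tooling.

```python
from statistics import median


def relative_change(baseline, candidate):
    """Relative change of the candidate median versus the baseline median."""
    base = median(baseline)
    return (median(candidate) - base) / base


def summarize(with_sched, without_sched, threshold=0.05):
    """Compare two runs; flag the difference only if it exceeds the threshold."""
    change = relative_change(with_sched, without_sched)
    return {
        "median_with": median(with_sched),
        "median_without": median(without_sched),
        "relative_change": change,
        "significant": abs(change) > threshold,
    }


# Hypothetical p99 latency samples in milliseconds (made-up numbers,
# not measurements from any real cluster).
with_sched = [100.0, 102.0, 98.0, 101.0, 99.0]
without_sched = [101.0, 103.0, 99.0, 102.0, 100.0]

result = summarize(with_sched, without_sched)
print(result)
```

      Medians are used here instead of means so that a single outlier run does not dominate the comparison; a real evaluation would likely need more runs and a proper statistical test.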

      Why is this important?

      • The changes that worked well on RHCOS/RHEL 7.x systems might affect the performance of current RHCOS/RHEL 8.x and 9.x systems.
      • We want to keep profiles consistent between RHEL and RHCOS for similar workloads.
      • There is potential to develop a system for automating the review of tuned profiles going forward.

      Scenarios

      1. ...

      Acceptance Criteria

      • CI - MUST be running successfully with tests automated
      • Release Technical Enablement - Provide necessary release enablement details and documents.
      • ...

      Dependencies (internal and external)

      1. ...

      Previous Work (Optional):

      1. RHEL 9.x testing performed by the BaseOS engineering and QE teams.

      Open questions:

      1. Testing strategy to evaluate the performance impact of the profile changes on the RHCOS/RHEL nodes.  Ideally, this would be a typical set (or at least a subset) of the tests run by the Perf&Scale team.  We need to focus on tests whose results are particularly influenced by the performance of the control plane.

      Done Checklist

      • CI - CI is running, tests are automated and merged.
      • Release Enablement <link to Feature Enablement Presentation>
      • DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
      • DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
      • DEV - Downstream build attached to advisory: <link to errata>
      • QE - Test plans in Polarion: <link or reference to Polarion>
      • QE - Automated tests merged: <link or reference to automated tests>
      • DOC - Downstream documentation merged: <link to meaningful PR>

      People

      • David Gray (dagray@redhat.com)
      • Jiri Mencak (jmencak)
      • Liquan Cui