RHEL-101426

[Upstream Work] Implement System Runner: Flexible Topology Management For Disparate Workloads


    • Type: Bug
    • Resolution: Unresolved
    • Component: rteval
    • Team: rhel-kernel-rts-time

      Description:
      Runs on Top of Application Profiles
      System-runner is designed to be application-agnostic. This enables both the extension of additional test-suite stacks into this orchestration layer and the creation of custom profiles for end-user application stacks. With this approach, multiple test platforms can be deployed, and users can deploy their application on the same topology stack that was measured and characterized.
      rteval-runner will serve as the initial profile for developing system-runner.
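
      The profile layer itself is not specified further in this issue; as a rough illustration only, the following Python sketch shows one way an application-agnostic profile interface could be shaped. The class names (ApplicationProfile, RtevalProfile) and the registry dict are hypothetical and not part of the design.

      {code:python}
      # Illustrative sketch only: these names are not defined by this issue.
      from abc import ABC, abstractmethod


      class ApplicationProfile(ABC):
          """One deployable workload stack (test suite or end-user application)."""

          name: str

          @abstractmethod
          def build_command(self) -> list[str]:
              """Return the command that launches this workload."""

          @abstractmethod
          def collect_results(self, run_dir: str) -> None:
              """Copy or parse workload output into the per-run directory."""


      class RtevalProfile(ApplicationProfile):
          """Initial profile: drive rteval as the measurement workload."""

          name = "rteval"

          def build_command(self) -> list[str]:
              # CPU placement is handled by the orchestration layer
              # (podman/tuna), not by the workload itself.
              return ["rteval", "--duration=1h"]

          def collect_results(self, run_dir: str) -> None:
              pass  # e.g. move rteval's report output into run_dir


      # A simple registry lets additional profiles be plugged in later.
      PROFILES = {p.name: p for p in (RtevalProfile(),)}
      {code}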

      Background:
      System-runner is being implemented to enable a powerful orchestration layer for modern real-time, isolation, and scalability studies across bare metal, Podman, and Kubernetes/OCP environments. The tool should provide advanced options for CPU/core partitioning, container topology, and dynamic system scaling to support a broad range of experimental workloads.

      Objective:
      Build system-runner with support for:

      • Advanced partitioning and topology control (per-container/per-core assignments)
      • “Partitioned” vs. “non-partitioned” workload definitions
      • A percent-based system scaling option (“scaling knob”) for dynamic control of system resource allocation
      • Application-profile-based deployment for advanced topology management, for both simulations and real applications.
      • Support for measuring and comparing both weak (intra-pod) isolation and strict (inter-pod or standalone) isolation (see the sketch after this list):
        • The tool should allow users to deploy and benchmark scenarios where multiple containers share a pod (and thus a cgroup), as well as scenarios where each workload is isolated in its own cgroup (a standalone container, or one workload per pod).
        • This enables direct measurement and research on the effects of resource sharing and noisy-neighbor phenomena within pods versus strong partitioning and cgroup isolation.
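
      As a rough illustration of the two isolation shapes above, the sketch below drives podman directly (podman pod create, podman run --pod, and --cpuset-cpus are existing podman options); the image name, helper functions, and CPU ranges are placeholders, not part of this issue.

      {code:python}
      # Hedged sketch: weak (shared-pod) vs. strict (standalone) isolation.
      import subprocess

      IMAGE = "registry.example.com/stress-workload:latest"  # placeholder image


      def run(cmd: list[str]) -> None:
          print(" ".join(cmd))
          subprocess.run(cmd, check=True)


      def weak_isolation(pod: str, cpusets: list[str]) -> None:
          """Multiple containers share one pod (and its pod-level cgroup)."""
          run(["podman", "pod", "create", "--name", pod])
          for i, cpuset in enumerate(cpusets):
              run(["podman", "run", "-d", "--pod", pod,
                   "--name", f"{pod}-w{i}", "--cpuset-cpus", cpuset, IMAGE])


      def strict_isolation(prefix: str, cpusets: list[str]) -> None:
          """Each workload is a standalone container with its own cgroup."""
          for i, cpuset in enumerate(cpusets):
              run(["podman", "run", "-d", "--name", f"{prefix}-w{i}",
                   "--cpuset-cpus", cpuset, IMAGE])


      if __name__ == "__main__":
          weak_isolation("shared-pod", ["2-3", "4-5"])
          strict_isolation("isolated", ["6-7", "8-9"])
      {code}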

      Requirements:

      1. Partitioning and Topology Options:
      • Allow configuration for both:
        • Single-container runs on a specific CPU range
        • Multiple containers, each pinned to a defined number of cores
      • Support “partitioned” (dedicated CPUs for load/measurement) and “non-partitioned” modes
      • Let users specify (see the config sketch after this list):
        • container_run_type: single or all
        • cpu_range
        • cores_per_container
        • partitioning: partitioned or nonpartitioned
        • use_tuna: true/false (for CPU isolation/affinity)
      2. Percent-Based System Scaling:
      • Implement a config/CLI option (e.g., system_scale_percent) to control what percentage of the host’s total CPUs/cores are used for the experiment.
      • All resource allocation (container count, cores per container, affinity) should respect this scaling parameter.
      • Example: on a 40-core host, system_scale_percent: 25 uses 10 cores for the workload (see the scaling sketch after this list).
      3. Automated Run Directory and Iteration Management:
      • Each experiment run should auto-generate a directory reflecting the run mode, topology, scaling, and iteration count.
      • Store the config, logs, and all results for traceability and reproducibility.
      4. Dependency and Resource Checks:
      • On startup, check for required host tools (podman, tuna, etc.) and clearly report or fail if any are missing.
      5. Dynamic Orchestration Logic:
      • Dynamically determine core sets and container counts, and launch containers or processes with correct CPU and memory assignments based on the config and scaling (see the orchestration sketch after this list).
      • Partition CPUs between measurement and load as specified.
      • Support both partitioned and non-partitioned measurement, with or without tuna.
      6. Extensible Config Schema & CLI:
      • Update the config schema to include all partitioning and scaling options.
      • CLI flags should allow overrides or direct specification at runtime.
      7. Documentation & Examples:
      • Document all new features and provide clear example configs and result directory structures for each major mode.
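
      The following sketches are illustrations only, not the implementation. First, a possible CLI surface for requirements 1, 2, and 6: the option names (container_run_type, cpu_range, cores_per_container, partitioning, use_tuna, system_scale_percent) come from this issue, but the flag spellings, defaults, and the use of argparse are assumptions.

      {code:python}
      # Hedged sketch of a system-runner CLI; not the final schema.
      import argparse


      def build_parser() -> argparse.ArgumentParser:
          p = argparse.ArgumentParser(prog="system-runner")
          p.add_argument("--config", help="path to a config file; CLI flags override it")
          p.add_argument("--container-run-type", choices=["single", "all"], default="single")
          p.add_argument("--cpu-range", default="0-3",
                         help="explicit CPU range for single-container runs, e.g. 2-9")
          p.add_argument("--cores-per-container", type=int, default=2)
          p.add_argument("--partitioning", choices=["partitioned", "nonpartitioned"],
                         default="partitioned")
          p.add_argument("--use-tuna", action="store_true",
                         help="use tuna for CPU isolation/affinity")
          p.add_argument("--system-scale-percent", type=int, default=100,
                         help="percentage of host CPUs to give the experiment")
          return p


      if __name__ == "__main__":
          args = build_parser().parse_args()
          print(vars(args))
      {code}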
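
      Next, a sketch of requirements 2 through 4: percent-based scaling, run-directory naming, and dependency checks. The naming pattern, function names, and the minimum tool list are assumptions made for illustration.

      {code:python}
      # Hedged sketch: scaling arithmetic, run-directory layout, dependency checks.
      import os
      import shutil
      import sys
      from datetime import datetime, timezone

      REQUIRED_TOOLS = ["podman", "tuna"]  # assumed minimum set


      def check_dependencies() -> None:
          """Fail fast with a clear message if a required host tool is missing."""
          missing = [t for t in REQUIRED_TOOLS if shutil.which(t) is None]
          if missing:
              sys.exit(f"system-runner: missing required tools: {', '.join(missing)}")


      def scaled_core_count(scale_percent: int) -> int:
          """Example from the issue: 25% of a 40-core host -> 10 cores."""
          total = os.cpu_count() or 1
          return max(1, (total * scale_percent) // 100)


      def make_run_dir(base: str, run_type: str, partitioning: str,
                       scale_percent: int, iteration: int) -> str:
          """Encode run mode, topology, scaling, and iteration in the directory name."""
          stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
          name = f"{run_type}_{partitioning}_scale{scale_percent}_iter{iteration}_{stamp}"
          path = os.path.join(base, name)
          os.makedirs(path, exist_ok=False)
          return path


      if __name__ == "__main__":
          check_dependencies()
          cores = scaled_core_count(25)
          run_dir = make_run_dir("results", "all", "partitioned", 25, iteration=1)
          print(f"using {cores} cores, writing to {run_dir}")
      {code}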
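
      Finally, a sketch of the orchestration step in requirement 5: turning the scaled core budget into per-container cpusets and launching pinned containers. The image, helper names, and the tuna invocation (whose exact syntax varies across tuna versions) are assumptions.

      {code:python}
      # Hedged sketch: derive per-container core sets and launch pinned containers.
      import subprocess

      IMAGE = "registry.example.com/stress-workload:latest"  # placeholder


      def split_cores(first_core: int, budget: int, per_container: int) -> list[str]:
          """Turn a contiguous core budget into per-container cpuset strings."""
          sets = []
          core = first_core
          while core + per_container <= first_core + budget:
              sets.append(f"{core}-{core + per_container - 1}")
              core += per_container
          return sets


      def launch(cpusets: list[str], use_tuna: bool) -> None:
          if use_tuna:
              # Exact tuna syntax differs between versions; classic form shown here.
              all_cpus = f"{cpusets[0].split('-')[0]}-{cpusets[-1].split('-')[1]}"
              subprocess.run(["tuna", "--cpus", all_cpus, "--isolate"], check=True)
          for i, cpuset in enumerate(cpusets):
              subprocess.run(["podman", "run", "-d", "--name", f"load-{i}",
                              "--cpuset-cpus", cpuset, IMAGE], check=True)


      if __name__ == "__main__":
          # e.g. a 10-core budget starting at core 2, 2 cores per container
          launch(split_cores(2, 10, 2), use_tuna=False)
      {code}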

      Acceptance Criteria:

      • User can fully specify partitioning, scaling, and core assignments in a config file or via CLI.
      • system-runner creates the correct run directories and manages resource allocation per the chosen options.
      • All artifacts and configs are stored per-run.
      • Required dependencies are checked at runtime with clear messaging.
      • At least one complete doc/example set is included.

      Notes:
      This will enable flexible, dynamic, and reproducible experiment orchestration for real-time and partitioning studies, closing a key gap in the research and testing toolchain. This tool is not intended to replace tuned or the Node Tuning Operator; it is meant to operate alongside them.

              People: William White (rhn-gps-chwhite), Waylon Cude