-
Story
-
Resolution: Done
Add package 'gpustat' into the RHAI pipeline onboarding collection.
The package requires builder repository onboarding before it can be added to the RHAI pipeline. This ticket is blocked by the builder onboarding ticket.
Summary
Executive Summary: gpustat Packaging Analysis
gpustat is a pure-Python CLI utility for monitoring NVIDIA GPUs, rated Simple (1/10) for build complexity. The package produces a universal py3-none-any wheel with no native compilation required — building is as straightforward as running pip wheel --no-deps gpustat==1.1.1. The license is MIT, fully compatible with Red Hat redistribution policies, and all transitive dependency licenses (BSD, BSD-3-Clause, MIT) are equally clear. There are no build blockers, no build warnings, and no custom toolchain requirements. The recommended source is the PyPI sdist for v1.1.1, the latest stable release.
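The "universal wheel" claim can be checked mechanically after the build. The following is a minimal sketch, not part of the pipeline: it parses the tags out of a wheel filename and confirms they are ABI "none" and platform "any". The function names and the example filenames are illustrative assumptions; the actual artifact name should be taken from the build output.

```python
def parse_wheel_tags(filename: str) -> tuple[str, str, str]:
    """Return (python_tag, abi_tag, platform_tag) from a wheel filename.

    Wheel filenames follow the pattern
    name-version[-build]-python-abi-platform.whl.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # 5 fields without a build tag, 6 with one.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag, abi_tag, platform_tag


def is_pure_python_wheel(filename: str) -> bool:
    """True when the wheel carries no ABI or platform constraint."""
    _, abi_tag, platform_tag = parse_wheel_tags(filename)
    return abi_tag == "none" and platform_tag == "any"


if __name__ == "__main__":
    # Hypothetical filename, assumed from the py3-none-any tag stated above.
    print(is_pure_python_wheel("gpustat-1.1.1-py3-none-any.whl"))
```

A check like this could run as a post-build gate so that an unexpected platform-specific wheel is flagged before it reaches an index.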
The primary consideration for onboarding is runtime dependency and hardware scope management, not build complexity. gpustat depends on nvidia-ml-py (pure-Python NVML bindings, BSD), psutil (C extensions, but pre-built manylinux wheels available, BSD-3-Clause), and blessed (pure Python, MIT). At runtime, the host must provide libnvidia-ml.so via the NVIDIA driver (R450+). On CUDA indexes, gpustat is fully functional. On CPU and ROCm indexes, the package installs without error but provides no functionality — it exits with code 1 and reports no devices. AMD GPU support does not exist (an unmerged PR is open but inactive).
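Because the package installs cleanly on every index but only works where the NVIDIA driver is present, a deployment may want to probe for the NVML library before invoking gpustat. The sketch below is one possible probe using only the standard library. The function name is hypothetical, and the candidate sonames (libnvidia-ml.so.1 first, then the unversioned libnvidia-ml.so named above) are an assumption about how the driver exposes the library on the host.

```python
import ctypes


def nvml_library_available(
    names: tuple[str, ...] = ("libnvidia-ml.so.1", "libnvidia-ml.so"),
) -> bool:
    """Return True if any candidate NVML shared library can be loaded.

    This only checks that the dynamic loader can find and open the
    library; it does not initialize NVML or enumerate devices.
    """
    for name in names:
        try:
            ctypes.CDLL(name)
            return True
        except OSError:
            # Library missing or not loadable on this host; try the next name.
            continue
    return False


if __name__ == "__main__":
    if nvml_library_available():
        print("NVML found: gpustat should be usable on this host")
    else:
        print("NVML not found: gpustat will report no devices here")
```

On CPU and ROCm hosts this probe is expected to return False, which matches the behavior described above where gpustat exits with code 1.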
The only notable runtime issue is a driver-specific bug affecting NVIDIA driver series 535.43–535.98, where gpustat reports only the first process per GPU (GitHub #161). The fix is either to upgrade the driver to >=535.98 or to pin nvidia-ml-py>=12.535.108; the upstream master branch already carries this pin, but it has not yet shipped in a release. No other issues affect build or packaging. If gpustat proves unsuitable for multi-hardware deployments, nvitop (MIT licensed, more feature-rich) is a viable alternative.
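The driver condition above can be expressed as a simple version comparison. This is a rough sketch under stated assumptions: the helper names are hypothetical, both range endpoints are treated as inclusive (the text leaves the status of 535.98 itself ambiguous), and the bounds are parameters so they can be adjusted once that is confirmed.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted numeric version such as '535.54.03' into (535, 54, 3)."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_in_affected_range(
    driver: str, low: str = "535.43", high: str = "535.98"
) -> bool:
    """True if the driver version lies within [low, high], both inclusive.

    Comparison is tuple-based, so a patch release such as '535.98.01'
    sorts above '535.98' and is treated as outside the range.
    """
    return parse_version(low) <= parse_version(driver) <= parse_version(high)


def nvidia_ml_py_has_fix(installed: str, minimum: str = "12.535.108") -> bool:
    """True if the installed nvidia-ml-py version meets the recommended pin."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    # Example values only; real versions come from the host and environment.
    print(driver_in_affected_range("535.54.03"))
    print(nvidia_ml_py_has_fix("12.535.108"))
```

A host is only exposed to the bug when the first check is true and the second is false, so the two helpers are meant to be used together.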
Key recommendations:
- Build from PyPI sdist v1.1.1 using a standard source build — no special environment variables or tooling needed
- For driver 535 environments, patch the nvidia-ml-py pin to >=12.535.108
- Accept that gpustat is NVIDIA-only; on CPU and ROCm hardware indexes it installs but provides no functionality (it exits with code 1 and reports no devices)
- No container-specific build requirements — libnvidia-ml.so is accessed at runtime from the host via the GPU operator
- blocks: AIPCC-10812 Onboard gpustat into the AIPCC Builder (Closed)