Feature
Resolution: Unresolved
Normal
Proposed title of this feature request
Collect accelerator metrics in OCP
What is the nature and description of the request?
With the rise of OpenShift AI, there is a need to collect metrics about accelerator cards (including but not limited to GPUs). Collection should require little to no configuration from customers. We recommend deploying a custom script that feeds the node_exporter textfile collector.
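To make the recommendation concrete, here is a minimal sketch of what such a script could look like. It is not an agreed design. It assumes accelerators can be found by scanning PCI devices under `/sys/bus/pci/devices` and filtering on the PCI class code (prefixes `0x0302` for 3D controllers and `0x12` for processing accelerators; this filter is an assumption and would need review). The metric name `node_accelerator_cards` and all function names are hypothetical. The output is in the Prometheus text exposition format and is written atomically to a `.prom` file, which is what the node_exporter textfile collector reads from the directory given by `--collector.textfile.directory`.

```python
import os
import tempfile
from collections import Counter
from pathlib import Path

# Assumption: PCI class prefixes treated as "accelerators" in this sketch.
# 0x0302 = 3D controller, 0x12 = processing accelerator.
ACCELERATOR_CLASS_PREFIXES = ("0x0302", "0x12")


def read_attr(device_dir, name):
    """Return the stripped content of a sysfs attribute, or "" if unreadable."""
    try:
        return (Path(device_dir) / name).read_text().strip()
    except OSError:
        return ""


def discover_accelerators(sysfs_root="/sys/bus/pci/devices"):
    """Count accelerator cards per (vendor ID, device ID) pair."""
    counts = Counter()
    root = Path(sysfs_root)
    if not root.is_dir():
        return counts
    for device_dir in sorted(root.iterdir()):
        pci_class = read_attr(device_dir, "class")
        if not pci_class.startswith(ACCELERATOR_CLASS_PREFIXES):
            continue
        key = (read_attr(device_dir, "vendor"), read_attr(device_dir, "device"))
        counts[key] += 1
    return counts


def render_metrics(counts):
    """Render the counts in the Prometheus text exposition format."""
    lines = [
        # Hypothetical metric name, chosen for illustration only.
        "# HELP node_accelerator_cards Number of accelerator cards by PCI vendor and device ID.",
        "# TYPE node_accelerator_cards gauge",
    ]
    for (vendor, device), count in sorted(counts.items()):
        lines.append(
            f'node_accelerator_cards{{vendor="{vendor}",device="{device}"}} {count}'
        )
    return "\n".join(lines) + "\n"


def write_atomically(text, target):
    """Write to a temp file in the same directory, then rename over the target,
    so that node_exporter never scrapes a half-written file."""
    target = Path(target)
    fd, tmp_path = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    os.chmod(tmp_path, 0o644)  # mkstemp creates 0600; make it readable by the exporter
    os.replace(tmp_path, target)
```

A periodic job on each node could call `write_atomically(render_metrics(discover_accelerators()), path)`, where `path` is a file ending in `.prom` inside the textfile collector directory. Because the script relies only on sysfs, it needs no vendor-specific tooling, which fits the goal of little to no customer configuration.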
Why does the customer need this? (List the business requirements)
Display inventory data about accelerators in the OCP admin console, as is already done for CPU, memory, and other resources on the Overview page.
Gain a better understanding of which accelerators are in use (Telemetry requirement).
List any affected packages or components.
node_exporter
CMO (Cluster Monitoring Operator)