Red Hat OpenStack Services on OpenShift / OSPRH-21638

Integration of GPU metrics exporter deployment

    • Epic
    • Resolution: Unresolved
    • Undefined
    • None
    • None
    • VAF
    • None
    • Integration of GPU metrics exporter deployment
    • True
    • Waiting on the legal department to weigh in on whether the exporter can be included in our product.
    • False
    • RHOSSTRAT-1074 - Extension of edpm-ansible with GPU specific software
    • Not Selected
    • ?
    • ?
    • In Progress
    • RHOSSTRAT-1074 - Extension of edpm-ansible with GPU specific software
    • ?
    • rhos-workloads-vaf
    • ?
    • 86% To Do, 14% In Progress, 0% Done

      Goal:

      • To provide an Ansible playbook and/or role in edpm-ansible for deploying the GPU metrics exporter

      Acceptance Criteria:

      • A patch containing the Ansible playbook/role for the exporter deployment is part of downstream edpm-ansible

      Open questions:

      • Are the utilization metrics suitable for Watcher's needs?
      • Can we downstream libnvidia-ml?
        • If not, is it okay to install it from NVIDIA's public repo?

       

      dcgm-exporter runs in a container but requires the libnvidia-ml and container toolkit RPMs to be installed on the host (the EDPM node). At runtime, the container toolkit maps the driver and management libraries into the container, giving it access to the hardware from inside the container.
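
The host-side dependency described above could be checked along these lines before the container is started. This is only a rough sketch, not the planned edpm-ansible role: the RPM name `nvidia-container-toolkit`, the container name, the CDI device name `nvidia.com/gpu=all`, and port 9400 are assumptions about a typical NVIDIA Container Toolkit and dcgm-exporter setup, and the image reference is left to the caller.

```python
import shutil
import subprocess

# "libnvidia-ml" is the name used in this issue; "nvidia-container-toolkit"
# is an assumed RPM name for the container toolkit. Both may differ per repo.
REQUIRED_HOST_RPMS = ("libnvidia-ml", "nvidia-container-toolkit")


def rpm_is_installed(name):
    """Return True if `rpm -q NAME` succeeds on this host."""
    if shutil.which("rpm") is None:
        return False
    result = subprocess.run(
        ["rpm", "-q", name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def missing_rpms(required=REQUIRED_HOST_RPMS, is_installed=rpm_is_installed):
    """List the required host RPMs that are not installed."""
    return [name for name in required if not is_installed(name)]


def exporter_command(image, host_port=9400):
    """Build a podman command line for the exporter container.

    Assumes a CDI spec for the GPUs has already been generated by the
    container toolkit, so that `nvidia.com/gpu=all` resolves, and that the
    exporter listens on port 9400 inside the container.
    """
    return [
        "podman", "run", "--detach",
        "--name", "dcgm-exporter",
        "--device", "nvidia.com/gpu=all",
        "--publish", f"{host_port}:9400",
        image,
    ]
```

A caller would first confirm that `missing_rpms()` returns an empty list and only then pass the result of `exporter_command(...)` to `subprocess.run`. An Ansible role would express the same ordering as a package task followed by a container task.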

              csibbitt-rh Chris Sibbitt
              mmagr@redhat.com Martin Magr
              rhos-workloads-vaf
              Votes: 0
              Watchers: 4
