Type: Bug
Resolution: Not a Bug
Priority: Major
Description of problem:
Follow-up / related to: https://issues.redhat.com/browse/RHODS-8796
If a user deploys a model with Model Serving and requests/forces GPU usage, the metrics reported by the cluster appear to show that the GPU is not being used and that inference is instead performed on the CPU (see the attached screenshots).
The screenshots were taken after ~6k requests had been made to the inference endpoint, but I've since tested with up to 20k requests and the results remain the same.
spryor@redhat.com thinks it could be because the mnist model used for this test is small enough that its GPU utilization does not get reported through the OpenShift metrics, but when we tried deploying a bigger model (yolo) we were not able to do so. If anyone has a better way to gauge GPU utilization, or a model that can be used to confirm these findings, that would be extremely helpful.
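As a possible alternative gauge, the DCGM metric can be queried directly from the cluster monitoring stack instead of relying on the console graphs. A minimal sketch, assuming a Thanos querier route in openshift-monitoring and a bearer token from oc whoami -t; the hostname and token handling below are placeholders, not the setup used for this report:
{code:python}
import os
import requests

# Placeholder: use the host from `oc get route thanos-querier -n openshift-monitoring`.
THANOS_URL = "https://thanos-querier-openshift-monitoring.apps.example.com"
TOKEN = os.environ["OCP_TOKEN"]  # e.g. the output of `oc whoami -t`

def print_gpu_utilization():
    """Fetch the current DCGM_FI_DEV_GPU_UTIL samples for every GPU in the cluster."""
    resp = requests.get(
        f"{THANOS_URL}/api/v1/query",
        params={"query": "DCGM_FI_DEV_GPU_UTIL"},
        headers={"Authorization": f"Bearer {TOKEN}"},
        verify=False,  # test clusters often use self-signed certs; drop if the CA is trusted
        timeout=30,
    )
    resp.raise_for_status()
    for result in resp.json()["data"]["result"]:
        labels = result["metric"]
        _, value = result["value"]
        # Label names depend on the DCGM exporter version; .get() keeps this tolerant.
        print(f"GPU {labels.get('gpu')} on {labels.get('Hostname')}: {value}%")

if __name__ == "__main__":
    print_gpu_utilization()
{code}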
Prerequisites (if any, like setup, operators/versions):
RHODS 1.27 RC
Steps to Reproduce:
- Provision GPU node
- Install Nvidia GPU Add-On
- Deploy a model in a model serving runtime that uses GPUs (i.e. GPU requested through the dashboard and, due to RHODS-8796, the force flag added manually to the ServingRuntime spec)
- Send requests to the inference endpoint (~thousands; see the load-generation sketch after this list)
- Monitor GPU usage (e.g. DCGM_FI_DEV_GPU_UTIL metric)
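For the load-generation step, something along these lines can be used, assuming the model is served over the KServe v2 REST protocol; the route, model name, and input tensor name/shape below are placeholders and should be taken from GET /v2/models/<model-name> for the actual deployment:
{code:python}
import numpy as np
import requests

# Placeholders: substitute the external route of the deployed model and the
# tensor name/shape reported by the runtime's /v2/models/<model-name> endpoint.
INFER_URL = "https://mnist-model-project.apps.example.com/v2/models/mnist/infer"
N_REQUESTS = 6000

# One dummy 28x28 grayscale image, flattened to the shape assumed for the mnist model.
payload = {
    "inputs": [
        {
            "name": "input-0",
            "shape": [1, 784],
            "datatype": "FP32",
            "data": np.random.rand(784).astype(np.float32).tolist(),
        }
    ]
}

session = requests.Session()
for i in range(1, N_REQUESTS + 1):
    resp = session.post(INFER_URL, json=payload, verify=False, timeout=30)
    resp.raise_for_status()
    if i % 500 == 0:
        print(f"{i} requests sent, last status {resp.status_code}")
{code}
While this runs, DCGM_FI_DEV_GPU_UTIL can be watched in parallel (e.g. with the query sketch above) to see whether utilization ever moves off 0.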
Actual results:
GPU usage is pinned at 0
Expected results:
GPU usage increases while inference requests are processed
Reproducibility (Always/Intermittent/Only Once):
Always