Type: Task
Resolution: Unresolved
Priority: Major
Sprint: ML Serving Sprint 1.29
The HPA feature of ModelMesh supports only CPU- and memory-based autoscaling. To make better scaling decisions, the autoscaler needs additional metrics, so we should research options such as custom metrics or the Custom Metrics Autoscaler Operator.
The goal of this ticket is to research those options.
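As a starting point for the research, here is a minimal sketch of what an HPA driven by a custom per-pod metric could look like, assuming something like the Prometheus Adapter or the Custom Metrics Autoscaler Operator (KEDA) exposes the metric to the custom metrics API. The metric name `modelmesh_requests_per_second`, the target value, and the Deployment name are illustrative assumptions, not confirmed ModelMesh details:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: modelmesh-serving
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: modelmesh-serving   # assumed target workload name
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metric:
        name: modelmesh_requests_per_second  # hypothetical metric; must be exposed via an adapter
      target:
        type: AverageValue
        averageValue: "100"
```

With a `type: Pods` metric, the HPA averages the value across pods and scales to keep that average at the target, which is the behavior we would want for request-rate-based scaling.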
slack thread: https://redhat-internal.slack.com/archives/C04FSLYLDQ8/p1685061377990309
David's idea is the following:

Here are my ideas that I know we can get from prometheus metrics:
- average latency of modelmesh requests
- average requests per minute
- gpu utilization

and others that may be possible, but I don't know all the modelmesh metrics currently available:
- queue length at runtime or modelmesh request level
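The metrics above could be derived with PromQL along these lines. The ModelMesh metric name used here (`modelmesh_api_request_milliseconds`) is an assumption and needs to be confirmed against the actual metrics ModelMesh exposes; `DCGM_FI_DEV_GPU_UTIL` assumes the NVIDIA DCGM exporter is deployed:

```promql
# average latency of modelmesh requests over 5m (assumed histogram metric)
rate(modelmesh_api_request_milliseconds_sum[5m])
  / rate(modelmesh_api_request_milliseconds_count[5m])

# average requests per minute (assumed counter)
rate(modelmesh_api_request_milliseconds_count[1m]) * 60

# average GPU utilization across GPUs (DCGM exporter, if deployed)
avg(DCGM_FI_DEV_GPU_UTIL)
```

Part of the research should be enumerating which of these series ModelMesh actually publishes, and whether queue length is observable at the runtime or request level.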
Reference: