Type: Epic
Resolution: Obsolete
Priority: Major
Fix Version/s: None
Affects Version/s: None
Component/s: RHODS
Labels:
- AI/ML
- RHODS

Epic Name:
Performance and scalability of watsonx prompt tuning
Workstream:
Inference, RHOAI
Color Status:
Not Selected
Ready:
False
Blocked:
False
Blocked Reason:
None

Epic Goal

Test the performance and scalability of prompt-tuned models served via the watsonx stack.
Ensure stability when thousands of users each send queries to prompt-tuned model instances.

Why is this important?

As raised by Daniele, depending on the architecture, prompt tuning may result in large numbers of CRs, models, and Pods, so the scalability of the architecture and of the relevant controllers must be tested. Control plane load should also be monitored.

Scenarios

...

Acceptance Criteria

...

Dependencies (internal and external)

...

Previous Work (Optional):

…

Open questions:

When will this feature be enabled, and how can it be used and tested?
What are the requirements and expectations in terms of number of users, namespaces, models, and requests per minute?

is related to

PSAP-1112 Performance and Scale testing for RHOAI releases with KServe stack

Closed

Assignee: Unassigned

Reporter: David Gray

Votes: 0

Watchers: 2

Created: 2023/07/07 7:21 PM

Updated: 2024/11/11 10:00 PM

Resolved: 2024/11/11 10:00 PM