Type: Story
Resolution: Unresolved
Priority: Undefined
Fix Version/s: None
Affects Version/s: None
Component/s: Model Validation
Labels: None

Blocked: False
Blocked Reason: None
Ready: False
Epic Link: GPUaaS Technology Research & Hands-On Evaluation
Goal

Build a clear and well-documented understanding of how GPU resources are managed on IBM Cloud today.

Description

This story focuses on discovery and documentation of GPU resource management practices on IBM Cloud, through direct coordination with the IBM Cloud team.

This work is not assumed to involve technical access or deployment.

The primary activity is expected to be one or more working sessions with the IBM Cloud team to understand their current architecture, tooling, and operational model.

At least two participants must work on this story together, to ensure shared understanding, avoid reliance on a single person's interpretation, and improve the quality of the output.

Primary contact: Kieran Forde

Topics to be covered include:

How GPU resources are provisioned and managed on IBM Cloud
The high-level architecture used for GPU allocation and scheduling
What abstractions or services exist for GPU consumers
How GPU usage and utilization are tracked
Whether dashboards or observability tools exist, and what visibility they provide
How resources are allocated behind the scenes
How quotas, priorities, and fairness are handled
Whether preemption is supported, and under what conditions
Whether GPU partitioning mechanisms such as MIG are used

The goal is to document the as-is state, not to evaluate or compare solutions at this stage.

Out of scope

Deploying workloads on IBM Cloud
Running GPU benchmarks or stress tests
Comparing IBM Cloud to other GPUaaS candidates

DoD

A meeting (or series of meetings) with the IBM Cloud team is completed
At least two team members participated in the sessions
A written summary document exists that includes:
  A bullet-point list of available GPU management features
  A high-level architecture overview
  How GPU resources are allocated, shared, and reclaimed
  How prioritization, quotas, preemption, and MIG (if applicable) are handled
  What dashboards or visibility exist for usage and utilization

The document is shared with the team and can be directly referenced in the GPUaaS evaluation and comparison phase.

Assignee: Wesley Spinks
Reporter: Aviran Badli
Votes: 0
Watchers: 1
Created: 2026/02/09 9:00 AM
Updated: 2026/02/12 8:34 AM