- Epic
- Resolution: Done
- Major
- Deploy long-running RHOAI instance
- Done
- 0% To Do, 0% In Progress, 100% Done
Epic Goal
- Deploy a long-running RHOAI instance (with GPUs) on ROSA that the team can use for development and testing of AI software templates (DEVHAS-643, DEVHAS-666)
Why is this important?
- The team does not currently have access to a long-running RHOAI instance and would benefit from having one
- Issues in DEVHAS-643 require access to such an environment
- Some of the child items in DEVHAS-643 need access to an environment with GPUs
Scenarios
- ...
Acceptance Criteria (Mandatory)
- ROSA cluster created with the following configuration: 3 nodes (2 GPU), HCP
- m5.2xlarge for non-GPU nodes, g5.2xlarge for GPU nodes
- GPU nodes available on the ROSA cluster, with GPU operator installed and configured
- RHOAI installed and configured
- Access configured for the team
- Either via rover group or GitHub teams
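The node-related criteria above can be sketched as a small check. This is a minimal illustration, not the team's actual validation tooling: the `Node` class and `check_cluster` function are hypothetical names, and it assumes "3 nodes (2 GPU)" means three worker nodes in total, two of which are GPU nodes. Node data would have to be gathered separately (for example from the cluster's node list).

```python
from dataclasses import dataclass

# Values taken from the acceptance criteria.
# Assumption: "3 nodes (2 GPU)" means 3 nodes in total, 2 of them GPU nodes.
EXPECTED_TOTAL_NODES = 3
EXPECTED_GPU_NODES = 2
NON_GPU_INSTANCE_TYPE = "m5.2xlarge"
GPU_INSTANCE_TYPE = "g5.2xlarge"


@dataclass
class Node:
    """Hypothetical, simplified view of a worker node (not a Kubernetes API type)."""
    name: str
    instance_type: str
    allocatable_gpus: int


def check_cluster(nodes):
    """Return a list of problems; an empty list means the node criteria are met."""
    problems = []

    if len(nodes) != EXPECTED_TOTAL_NODES:
        problems.append(f"expected {EXPECTED_TOTAL_NODES} nodes, found {len(nodes)}")

    gpu_nodes = [n for n in nodes if n.instance_type == GPU_INSTANCE_TYPE]
    if len(gpu_nodes) != EXPECTED_GPU_NODES:
        problems.append(f"expected {EXPECTED_GPU_NODES} GPU nodes, found {len(gpu_nodes)}")

    allowed_types = (NON_GPU_INSTANCE_TYPE, GPU_INSTANCE_TYPE)
    for n in nodes:
        if n.instance_type not in allowed_types:
            problems.append(f"{n.name}: unexpected instance type {n.instance_type}")

    # A GPU node that reports no allocatable GPU usually points at the GPU operator setup.
    for n in gpu_nodes:
        if n.allocatable_gpus < 1:
            problems.append(f"{n.name}: no allocatable GPU reported")

    return problems


if __name__ == "__main__":
    # Illustrative node names only.
    nodes = [
        Node("worker-0", "m5.2xlarge", 0),
        Node("gpu-0", "g5.2xlarge", 1),
        Node("gpu-1", "g5.2xlarge", 1),
    ]
    print(check_cluster(nodes))  # prints [] when the node criteria are met
```

The check only covers node count, instance types, and GPU availability; RHOAI installation and team access would still need to be verified separately.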
Dependencies (internal and external)
- ...
Previous Work (Optional):
- …
Open questions:
- …
Done Checklist
- Acceptance criteria are met
- Non-functional properties of the Feature have been validated (such as performance, resource, UX, security or privacy aspects)
- User Journey automation is delivered
- Support and SRE teams are provided with enough skills to support the feature in a production environment