• Type: Epic
    • Resolution: Done
    • Priority: Major
    • Deploy long-running RHOAI instance
    • Done
    • 0% To Do, 0% In Progress, 100% Done

      Epic Goal

      • Deploy a long-running RHOAI (with GPUs) instance on ROSA that will be available for the team to use for development and testing of AI software templates (DEVHAS-643, DEVHAS-666)

      Why is this important?

      • The team does not currently have access to a long-running RHOAI instance and would benefit from having one
      • Issues in DEVHAS-643 require access to such an environment
      • Some of the child items in DEVHAS-643 need access to an environment with GPUs

      Scenarios

      1. ...

      Acceptance Criteria (Mandatory)

      • ROSA cluster created with the following configuration: hosted control plane (HCP), 3 nodes (2 GPU)
        • m5.2xlarge for non-GPU nodes, g5.2xlarge for GPU nodes
      • GPU nodes available on the ROSA cluster, with GPU operator installed and configured
      • RHOAI installed and configured
      • Access configured for the team
        • Either via rover group or GitHub teams
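
      The node layout in the first criterion can be checked mechanically. The following is a minimal sketch, not the team's actual tooling: it assumes "3 nodes (2 GPU)" means three worker nodes in total, two of which are GPU nodes, and the function name check_node_layout is hypothetical.

```python
from collections import Counter

# Instance types taken from the acceptance criteria.
NON_GPU_TYPE = "m5.2xlarge"
GPU_TYPE = "g5.2xlarge"

# Assumption: "3 nodes (2 GPU)" means 3 worker nodes in total, 2 of them GPU nodes.
EXPECTED = Counter({GPU_TYPE: 2, NON_GPU_TYPE: 1})


def check_node_layout(instance_types):
    """Return a list of problems; an empty list means the layout matches."""
    actual = Counter(instance_types)
    problems = []
    # Compare every instance type that is either expected or actually present.
    for itype in sorted(set(EXPECTED) | set(actual)):
        want, have = EXPECTED.get(itype, 0), actual.get(itype, 0)
        if want != have:
            problems.append(f"{itype}: expected {want}, found {have}")
    return problems


if __name__ == "__main__":
    # Hypothetical input: one instance type per worker node.
    print(check_node_layout(["m5.2xlarge", "g5.2xlarge", "g5.2xlarge"]))
```

      The input list could, for example, be collected from each node's node.kubernetes.io/instance-type label; how it is gathered is left open here.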

      Dependencies (internal and external)

      1. ...

      Previous Work (Optional):

      Open questions:

      •  

      Done Checklist

      • Acceptance criteria are met
      • Non-functional properties of the Feature have been validated (such as performance, resource, UX, security or privacy aspects)
      • User Journey automation is delivered
      • Support and SRE teams are provided with enough skills to support the feature in production environment

              Assignee: Unassigned
              Reporter: John Collier (johnmcollier)
              RHIDP - AI