OpenStack Strategy / RHOSSTRAT-655

BRQ2 cluster for validated architecture

    • Type: Feature
    • Resolution: Done
    • Priority: Major
    • rhos-18.0.10 FR 3

      Feature Overview
      This feature covers the creation of a small-scale, validated Red Hat OpenStack Platform (RHOSP) 18 cluster. The cluster is designed to simulate a production environment for testing and validating hardware accelerators (such as GPUs) for AI/ML workloads. The initial deployment will be in the BRQ2 location and will focus on validating the end-to-end customer experience.

      Goals

      • Deploy a functional, stable, small-scale RHOSP 18 cluster in the BRQ2 lab.
      • Establish a repeatable, automated process for deploying and managing the cluster to ensure consistency.
      • Create an environment suitable for testing and benchmarking various AI accelerator hardware.
      • Enable engineering teams to effectively simulate and validate customer AI workload scenarios.

      Requirements
      Hardware Provisioning:

      • Identify and secure sufficient physical servers in the BRQ2 lab to act as control, compute, and storage nodes.
      • Ensure the selected hardware meets the minimum requirements for RHOSP 18 and the specific AI accelerators to be tested.

      Network Configuration:

      • Design and implement the necessary network infrastructure, including VLANs, subnets, and routing.
      • The network must support all required RHOSP traffic types (Control, Internal API, Storage, Tenant, External).
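A network plan like the one required above could be sanity-checked before deployment. The following is a minimal sketch, not the BRQ2 design: the network names mirror the traffic types listed in this ticket, while every VLAN ID and CIDR is a hypothetical placeholder.

```python
import ipaddress
from itertools import combinations

# Traffic types named in the requirement above.
REQUIRED_NETWORKS = {"control", "internal_api", "storage", "tenant", "external"}

# Hypothetical plan: VLAN IDs and CIDRs are placeholders, not BRQ2 values.
NETWORK_PLAN = {
    "control": {"vlan": None, "cidr": "192.168.122.0/24"},  # None = untagged
    "internal_api": {"vlan": 20, "cidr": "172.17.0.0/24"},
    "storage": {"vlan": 21, "cidr": "172.18.0.0/24"},
    "tenant": {"vlan": 22, "cidr": "172.19.0.0/24"},
    "external": {"vlan": 23, "cidr": "10.0.0.0/24"},
}


def validate_plan(plan):
    """Return a list of human-readable problems; an empty list means the plan passed."""
    errors = [
        f"missing network: {name}"
        for name in sorted(REQUIRED_NETWORKS - set(plan))
    ]

    # VLAN IDs must be in the valid 802.1Q range and must not be reused.
    seen = {}
    for name, net in plan.items():
        vlan = net["vlan"]
        if vlan is None:
            continue
        if not 1 <= vlan <= 4094:
            errors.append(f"{name}: VLAN {vlan} is outside 1-4094")
        elif vlan in seen:
            errors.append(f"VLAN {vlan} used by both {seen[vlan]} and {name}")
        else:
            seen[vlan] = name

    # Subnets must not overlap, otherwise routing between them is ambiguous.
    subnets = {name: ipaddress.ip_network(net["cidr"]) for name, net in plan.items()}
    for (a, net_a), (b, net_b) in combinations(subnets.items(), 2):
        if net_a.overlaps(net_b):
            errors.append(f"{a} overlaps {b}")

    return errors


if __name__ == "__main__":
    for problem in validate_plan(NETWORK_PLAN) or ["plan looks consistent"]:
        print(problem)
```

Keeping the plan as data makes it easy to store in version control next to the deployment automation and to re-check whenever a subnet or VLAN changes.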

      Deployment Automation:

      • Develop robust automation (e.g., using Ansible) for the bare-metal provisioning and deployment of the complete RHOSP 18 cluster.
      • The automation must be idempotent, configurable, and version-controlled.
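The idempotency requirement above can be illustrated with a small "ensure" step that reports whether it changed anything, similar in spirit to how Ansible tasks report a changed status. This is only a sketch: the function name `ensure_file`, the file name, and the file content are hypothetical and not part of the actual automation.

```python
import tempfile
from pathlib import Path


def ensure_file(path: Path, content: str) -> bool:
    """Bring `path` to the desired content; return True only if a change was made."""
    if path.exists() and path.read_text() == content:
        return False  # already in the desired state, nothing to do
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "cluster" / "site.conf"  # hypothetical config file
        print(ensure_file(target, "site = brq2\n"))  # first run creates the file
        print(ensure_file(target, "site = brq2\n"))  # second run finds nothing to change
```

Running a step twice and seeing no change on the second run is a cheap way to check that a redeploy will not drift from the version-controlled configuration.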

      Acceptance Criteria
      [ ] A RHOSP 18 cluster is successfully deployed and fully operational in the BRQ2 location.
      [ ] All core OpenStack services (Keystone, Nova, Neutron, Cinder, Glance) are healthy, accessible via API/CLI, and pass health checks.
      [ ] The entire deployment process is automated. The cluster can be torn down and redeployed from scratch using the automation with minimal manual intervention.
      [ ] At least one model of AI accelerator can be successfully provisioned to a Nova instance (e.g., via PCI-passthrough) and is usable from within the guest OS.
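For the PCI-passthrough criterion, the Nova settings involved can be sketched as follows. The `[pci]` options `device_spec` and `alias` and the flavor property `pci_passthrough:alias` are standard Nova settings; the helper name `pci_settings`, the alias name, and the product ID are hypothetical placeholders. How these settings reach the compute and API services depends on the deployment tooling and is not shown here.

```python
import json


def pci_settings(vendor_id, product_id, alias_name, count=1, device_type="type-PCI"):
    """Build the Nova [pci] snippet and flavor property for one accelerator model.

    device_type is an assumption: SR-IOV-capable devices may need "type-PF".
    """
    device_spec = {"vendor_id": vendor_id, "product_id": product_id}
    alias = {
        "vendor_id": vendor_id,
        "product_id": product_id,
        "device_type": device_type,
        "name": alias_name,
    }
    nova_conf = "\n".join(
        [
            "[pci]",
            # device_spec tells nova-compute which host devices may be passed through.
            "device_spec = " + json.dumps(device_spec),
            # alias gives the device a name that flavors can request.
            "alias = " + json.dumps(alias),
        ]
    )
    # Flavor extra spec: request `count` devices matching the alias.
    flavor_property = f"pci_passthrough:alias={alias_name}:{count}"
    return {"nova_conf": nova_conf, "flavor_property": flavor_property}


if __name__ == "__main__":
    # "10de" is the NVIDIA PCI vendor ID; "1234" is a placeholder product ID.
    settings = pci_settings("10de", "1234", "gpu")
    print(settings["nova_conf"])
    print(settings["flavor_property"])
```

Inside a guest booted from a flavor carrying that property, listing PCI devices (for example with `lspci -nn`) and looking for the expected vendor and product IDs is one way to confirm that the accelerator is visible before installing drivers.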

              pkubica@redhat.com Petr Kubica
              lsvaty@redhat.com Lukas Svaty
              Lukas Svaty
              rhos-workloads-lightspeed