Feature
Resolution: Unresolved
Major
Product / Portfolio Work
0% To Do, 100% In Progress, 0% Done
M
Feature Overview (aka. Goal Summary)
Add support for NVIDIA H100- and H200-enabled machine series in OpenShift deployments on Azure.
Goals (aka. expected user outcomes)
Support deploying OpenShift on Azure with the following machine series:
- ND-H100-v5
  - Standard_ND96isr_H100_v5
- NCads_H100_v5
  - Standard_NC40ads_H100_v5
  - Standard_NC80adis_H100_v5
- NCCads_H100_v5
  - Standard_NCC40ads_H100_v5
- ND-H200-v5
  - Standard_ND96isr_H200_v5
Requirements (aka. Acceptance Criteria):
All of these machine series can be selected at install time when deploying OpenShift on Azure.
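As a rough illustration of this acceptance criterion, the sketch below checks a requested VM size against the sizes listed in the goals and builds a compute pool entry shaped like the `compute[].platform.azure.type` field of an install-config.yaml. The grouping of sizes under each series is inferred from the names in the list above, and the helper names (`H100_H200_SIZES`, `is_in_scope`, `worker_pool`) are hypothetical and not part of the installer.

```python
# Machine series and VM sizes named in this Feature's goals.
# Assumption: the series-to-size grouping is inferred from the size names.
H100_H200_SIZES = {
    "ND-H100-v5": ["Standard_ND96isr_H100_v5"],
    "NCads_H100_v5": ["Standard_NC40ads_H100_v5", "Standard_NC80adis_H100_v5"],
    "NCCads_H100_v5": ["Standard_NCC40ads_H100_v5"],
    "ND-H200-v5": ["Standard_ND96isr_H200_v5"],
}


def is_in_scope(vm_size: str) -> bool:
    """Return True if vm_size is one of the sizes listed in this Feature."""
    return any(vm_size in sizes for sizes in H100_H200_SIZES.values())


def worker_pool(vm_size: str, replicas: int = 3) -> dict:
    """Build a compute pool entry that selects vm_size for worker nodes.

    Hypothetical helper: the dictionary mirrors the install-config.yaml
    layout (compute[].platform.azure.type); it does not call the installer.
    """
    if not is_in_scope(vm_size):
        raise ValueError(f"{vm_size} is not covered by this Feature")
    return {
        "name": "worker",
        "replicas": replicas,
        "platform": {"azure": {"type": vm_size}},
    }


if __name__ == "__main__":
    print(worker_pool("Standard_NC40ads_H100_v5"))
```

Serialized to YAML, the returned dictionary would be one entry of the `compute` list in install-config.yaml. Whether a given size can actually be provisioned also depends on regional availability and subscription quota, which this sketch does not check.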
Anyone reviewing this Feature needs to know which deployment configurations the Feature will apply to (or not) once it has been completed. Describe specific needs (or indicate N/A) for each of the following deployment scenarios. For specific configurations that are out of scope for a given release, also provide the OCPSTRAT for the configuration to be supported in the future.
Deployment considerations | List applicable specific needs (N/A = not applicable) |
Self-managed, managed, or both | |
Classic (standalone cluster) | |
Hosted control planes | |
Multi node, Compact (three node), or Single node (SNO), or all | |
Connected / Restricted Network | |
Architectures, e.g. x86_64, ARM (aarch64), IBM Power (ppc64le), and IBM Z (s390x) | |
Operator compatibility | |
Backport needed (list applicable versions) | |
UI need (e.g. OpenShift Console, dynamic plugin, OCM) | |
Other (please specify) | |
Background
Customer demand for running AI-enabled workloads in the cloud keeps increasing. To support our customers, we need to enable the latest GPUs available on the market.
Documentation Considerations
The usual documentation update is needed to list these machine series as tested.
Interoperability Considerations
This feature will later be consumed by Azure Red Hat OpenShift (ARO).
- blocks: OCPSTRAT-2177 Azure - Support confidential GPUs within Confidential Clusters (New)
- is triggered by: RFE-7093 Request to add support for H100 and H200 GPU instances in Azure Red Hat OpenShift (ARO) (Approved)