-
Bug
-
Resolution: Done
-
rhelai-1.5
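To Reproduce
Steps to reproduce the behavior:
1. Run `ilab config init` on unrecognized hardware.
2. Select "NVIDIA" as the hardware vendor.
3. Observe the misnamed profile at entry [11]: it is listed as "NVIDIA H100 X8", duplicating entry [6], although it appears to correspond to the H200 X4 profile.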
    Please choose a system profile.
    Profiles set hardware-specific defaults for all commands and sections of the configuration.
    First, please select the hardware vendor your system falls into
    [0] NO SYSTEM PROFILE
    [1] NVIDIA
    Enter the number of your choice [0]: 1
    You selected: NVIDIA
    Next, please select the specific hardware configuration that most closely matches your system.
    [0] NO SYSTEM PROFILE
    [1] NVIDIA L4 X8
    [2] NVIDIA L40S X4
    [3] NVIDIA L40S X8
    [4] NVIDIA H100 X4
    [5] NVIDIA H100 X2
    [6] NVIDIA H100 X8
    [7] NVIDIA A100 X4
    [8] NVIDIA A100 X2
    [9] NVIDIA A100 X8
    [10] NVIDIA H200 X8
    [11] NVIDIA H100 X8
    [12] NVIDIA H200 X2
    [13] NVIDIA H200 X1
    Enter the number of your choice [hit enter for hardware defaults] [0]
4. View `.local/share/instructlab/internal/system_profiles/nvidia/h200/h200_x4.yaml`. The metadata section is incorrect (it describes H100 X8 rather than H200 X4), while every other section seems correct:
    metadata:
      gpu_manufacturer: Nvidia
      gpu_family: H100
      gpu_count: 8
      gpu_sku: [NVL, PCIe]
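
A mismatch like this could be caught automatically by comparing each profile's file name with its metadata block. The following is only a rough sketch, not InstructLab code: it assumes profile files follow a `<family>_x<count>.yaml` naming pattern (inferred from the single path `h200_x4.yaml` above), and the helpers `read_metadata`, `check_profile`, and `check_tree` are hypothetical names. It uses a tiny hand-rolled reader for the flat `metadata:` block so that it has no third-party dependencies; a real check would more likely use a proper YAML parser.

```python
import re
from pathlib import Path

# Assumed naming convention (hypothetical): <family>_x<count>.yaml, e.g. h200_x4.yaml
NAME_RE = re.compile(r"^(?P<family>[a-z0-9]+)_x(?P<count>\d+)\.yaml$")


def read_metadata(text: str) -> dict[str, str]:
    """Collect the scalar keys indented under a top-level `metadata:` key.

    Deliberately minimal: handles only the flat block shown in this report.
    """
    meta: dict[str, str] = {}
    in_block = False
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        if not line[0].isspace():
            # Any top-level key either opens or closes the metadata block.
            in_block = line.strip() == "metadata:"
            continue
        if in_block and ":" in line:
            key, _, value = line.strip().partition(":")
            meta[key.strip()] = value.strip()
    return meta


def check_profile(filename: str, text: str) -> list[str]:
    """Return a list of mismatches between the file name and its metadata."""
    match = NAME_RE.match(filename.lower())
    if match is None:
        return [f"{filename}: name does not follow <family>_x<count>.yaml"]
    meta = read_metadata(text)
    problems = []
    if meta.get("gpu_family", "").lower() != match["family"]:
        problems.append(
            f"{filename}: gpu_family is {meta.get('gpu_family')!r}, "
            f"file name implies {match['family'].upper()!r}"
        )
    if meta.get("gpu_count") != match["count"]:
        problems.append(
            f"{filename}: gpu_count is {meta.get('gpu_count')!r}, "
            f"file name implies {match['count']!r}"
        )
    return problems


def check_tree(root: Path) -> list[str]:
    """Run check_profile over every .yaml file below root."""
    problems: list[str] = []
    for path in sorted(root.rglob("*.yaml")):
        problems.extend(check_profile(path.name, path.read_text()))
    return problems
```

Calling `check_tree` on the `system_profiles` directory would then list every profile whose metadata disagrees with its file name. Given the metadata shown above, `h200_x4.yaml` would be reported for both `gpu_family` and `gpu_count`.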
Device Info (please complete the following information):
- Hardware Specs: Any cloud instance (AWS EC2 or similar) whose hardware is not recognized, such as dual NVIDIA L40S (IBM Cloud: gx3-48x240x2l40s)
- OS Version: RHEL AI staging 1.5-5
- InstructLab Version: 0.26.0
- Image: registry.stage.redhat.io/rhelai1/bootc-nvidia-rhel9:1.5
Bug impact
- Clearly visible typo whenever a user has to select their hardware.
- A user with H200 X4 hardware who must use the selector is unlikely to pick the correct entry, because it is labeled "NVIDIA H100 X8".
- Whether the H200 X4 profile actually works has not been tested.