-
Bug
-
Resolution: Done
4.12
Done
Description of problem:
An IPI Azure cluster fails to deploy when the region is set to westus3 and the VM type to NV8as_v4. The master node is shown as running in the Azure portal, but SSH login to it fails. The serial log shows the following errors:
[ 3009.547219] amdgpu d1ef:00:00.0: amdgpu: failed to write reg:de0
[ 3011.982399] mlx5_core 6637:00:02.0 enP26167s1: TX timeout detected
[ 3011.987010] mlx5_core 6637:00:02.0 enP26167s1: TX timeout on queue: 0, SQ: 0x170, CQ: 0x84d, SQ Cons: 0x823 SQ Prod: 0x840, usecs since last trans: 2418884000
[ 3011.996946] mlx5_core 6637:00:02.0 enP26167s1: TX timeout on queue: 1, SQ: 0x175, CQ: 0x852, SQ Cons: 0x248c SQ Prod: 0x24a7, usecs since last trans: 2148366000
[ 3012.006980] mlx5_core 6637:00:02.0 enP26167s1: TX timeout on queue: 2, SQ: 0x17a, CQ: 0x857, SQ Cons: 0x44a1 SQ Prod: 0x44c0, usecs since last trans: 2055000000
[ 3012.016936] mlx5_core 6637:00:02.0 enP26167s1: TX timeout on queue: 3, SQ: 0x17f, CQ: 0x85c, SQ Cons: 0x405f SQ Prod: 0x4081, usecs since last trans: 1913890000
[ 3012.026954] mlx5_core 6637:00:02.0 enP26167s1: TX timeout on queue: 4, SQ: 0x184, CQ: 0x861, SQ Cons: 0x39f2 SQ Prod: 0x3a11, usecs since last trans: 2020978000
[ 3012.037208] mlx5_core 6637:00:02.0 enP26167s1: TX timeout on queue: 5, SQ: 0x189, CQ: 0x866, SQ Cons: 0x1784 SQ Prod: 0x17a6, usecs since last trans: 2185513000
[ 3012.047178] mlx5_core 6637:00:02.0 enP26167s1: TX timeout on queue: 6, SQ: 0x18e, CQ: 0x86b, SQ Cons: 0x4c96 SQ Prod: 0x4cb3, usecs since last trans: 2124353000
[ 3012.056893] mlx5_core 6637:00:02.0 enP26167s1: TX timeout on queue: 7, SQ: 0x193, CQ: 0x870, SQ Cons: 0x3bec SQ Prod: 0x3c0f, usecs since last trans: 1855857000
[ 3021.535888] amdgpu d1ef:00:00.0: amdgpu: failed to write reg:e15
[ 3021.545955] BUG: unable to handle kernel paging request at ffffb57b90159000
[ 3021.550864] PGD 100145067 P4D 100145067 PUD 100146067 PMD 0
According to the Azure documentation (https://learn.microsoft.com/en-us/azure/virtual-machines/nvv4-series), the NVv4 series appears to support only Windows VMs.
Version-Release number of selected component (if applicable):
4.12 nightly build
How reproducible:
Always
Steps to Reproduce:
1. Prepare install-config.yaml with the region set to westus3 and the VM type set to NV8as_v4.
2. Install the cluster.
Actual results:
The installation fails.
Expected results:
If the NVv4 series is not supported for Linux VMs, the installer should validate the VM size up front and report that the size is not supported.
Additional info:
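The validation requested under Expected results could look roughly like the sketch below. It is only an illustration, not the installer's actual code: the function name, the deny list, and the sample size-to-family mapping are all hypothetical, and the family name is taken from the title of the linked issue. A real check would presumably obtain the family of each size from the cloud provider's SKU listing for the target region.

```python
# Hypothetical deny list of VM families that cannot run the cluster's Linux image.
# The family name is taken from the linked issue title; the list is an assumption.
UNSUPPORTED_FAMILIES = {"standardNVSv4Family"}


def validate_instance_type(instance_type, family_by_size):
    """Return an error message if the size should be rejected, else None.

    family_by_size maps a VM size name to its family name. Here it is a plain
    dict supplied by the caller (an assumption made for this sketch).
    """
    family = family_by_size.get(instance_type)
    if family is None:
        return f"instance type {instance_type!r} not found in the region's size list"
    if family in UNSUPPORTED_FAMILIES:
        return f"instance type {instance_type!r} (family {family}) is not supported"
    return None


# Illustrative data only, standing in for a real SKU query.
sample_skus = {
    "Standard_NV8as_v4": "standardNVSv4Family",
    "Standard_D8s_v3": "standardDSv3Family",
}

if __name__ == "__main__":
    for size in ("Standard_NV8as_v4", "Standard_D8s_v3"):
        error = validate_instance_type(size, sample_skus)
        print(size, "->", error or "ok")
```

Checking by family rather than by individual size name means every size in the NVv4 series would be rejected before any VM is created, instead of failing later during boot.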
- blocks: OCPBUGS-5992 [azure] Fail to create master node with vm size in standardNVSv4Family (Closed)
- is cloned by: OCPBUGS-5992 [azure] Fail to create master node with vm size in standardNVSv4Family (Closed)