RHEL-3672

Cannot enable PCIe Resizable BAR on Intel ARC GPU [DG2]

    • None
    • Impediment
    • Low
    • rhel-sst-arch-hw
    • ssg_platform_enablement
    • 13
    • Dev nak
    • True

      Description of problem:

      I have an Intel DG2 in the following configuration:

      [0000:5d]00.0[5e-61]---00.0[5f-61]--+01.0[60]----00.0 Intel Corporation DG2 [Arc A380]
      -04.0-[61]----00.0 Intel Corporation DG2 Audio Controller

      /proc/iomem reports the following resources:

      b8800000-c5ffffff : PCI Bus 0000:5d
      b9000000-ba0fffff : PCI Bus 0000:5e
      b9000000-ba0fffff : PCI Bus 0000:5f
      b9000000-b9ffffff : PCI Bus 0000:60
      b9000000-b9ffffff : 0000:60:00.0
      ba000000-ba0fffff : PCI Bus 0000:61
      ba000000-ba003fff : 0000:61:00.0

      b000000000-bfffffffff : PCI Bus 0000:5d
      bfe0000000-bff07fffff : PCI Bus 0000:5e
      bfe0000000-bfefffffff : PCI Bus 0000:5f
      bfe0000000-bfefffffff : PCI Bus 0000:60
      bfe0000000-bfefffffff : 0000:60:00.0
      bff0000000-bff07fffff : 0000:5e:00.0

      From dmesg, resource assignments:

      PCI host bridge to bus 0000:5d
      pci_bus 0000:5d: root bus resource [io 0x8000-0x9fff window]
      pci_bus 0000:5d: root bus resource [mem 0xb8800000-0xc5ffffff window]
      pci_bus 0000:5d: root bus resource [mem 0xb000000000-0xbfffffffff window]
      pci_bus 0000:5d: root bus resource [bus 5d-7f]

      pci 0000:5e:00.0: [8086:4fa1] type 01 class 0x060400
      pci 0000:5e:00.0: reg 0x10: [mem 0xbff0000000-0xbff07fffff 64bit pref]

      pci 0000:5d:00.0: PCI bridge to [bus 5e-61]
      pci 0000:5d:00.0: bridge window [mem 0xb9000000-0xba0fffff]
      pci 0000:5d:00.0: bridge window [mem 0xbfe0000000-0xbff07fffff 64bit pref]

      pci 0000:5e:00.0: PCI bridge to [bus 5f-61]
      pci 0000:5e:00.0: bridge window [mem 0xb9000000-0xba0fffff]
      pci 0000:5e:00.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]

      pci 0000:60:00.0: [8086:56a5] type 00 class 0x030000
      pci 0000:60:00.0: reg 0x10: [mem 0xb9000000-0xb9ffffff 64bit]
      pci 0000:60:00.0: reg 0x18: [mem 0xbfe0000000-0xbfefffffff 64bit pref]
      pci 0000:60:00.0: reg 0x30: [mem 0xffe00000-0xffffffff pref]

      pci 0000:5f:01.0: PCI bridge to [bus 60]
      pci 0000:5f:01.0: bridge window [mem 0xb9000000-0xb9ffffff]
      pci 0000:5f:01.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]

      pci 0000:61:00.0: [8086:4f92] type 00 class 0x040300
      pci 0000:61:00.0: reg 0x10: [mem 0xba000000-0xba003fff 64bit]

      pci 0000:5f:04.0: PCI bridge to [bus 61]
      pci 0000:5f:04.0: bridge window [mem 0xba000000-0xba0fffff]

      And finally, lspci:

      0000:5d:00.0 PCI bridge: Intel Corporation Sky Lake-E PCI Express Root Port A (rev 07) (prog-if 00 [Normal decode])
      Bus: primary=5d, secondary=5e, subordinate=61, sec-latency=0
      I/O behind bridge: 0000f000-00000fff [disabled]
      Memory behind bridge: b9000000-ba0fffff [size=17M]
      Prefetchable memory behind bridge: 000000bfe0000000-000000bff07fffff [size=264M]

      0000:5e:00.0 PCI bridge: Intel Corporation Device 4fa1 (rev 01) (prog-if 00 [Normal decode])
      Region 0: Memory at bff0000000 (64-bit, prefetchable) [size=8M]
      Bus: primary=5e, secondary=5f, subordinate=61, sec-latency=0
      I/O behind bridge: 0000f000-00000fff [disabled]
      Memory behind bridge: b9000000-ba0fffff [size=17M]
      Prefetchable memory behind bridge: 000000bfe0000000-000000bfefffffff [size=256M]

      0000:5f:01.0 PCI bridge: Intel Corporation Device 4fa4 (prog-if 00 [Normal decode])
      Bus: primary=5f, secondary=60, subordinate=60, sec-latency=0
      I/O behind bridge: 0000f000-00000fff [disabled]
      Memory behind bridge: b9000000-b9ffffff [size=16M]

      0000:5f:04.0 PCI bridge: Intel Corporation Device 4fa4 (prog-if 00 [Normal decode])
      Bus: primary=5f, secondary=61, subordinate=61, sec-latency=0
      I/O behind bridge: 0000f000-00000fff [disabled]
      Memory behind bridge: ba000000-ba0fffff [size=1M]
      Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff [disabled]

      0000:60:00.0 VGA compatible controller: Intel Corporation DG2 [Arc A380] (rev 05) (prog-if 00 [VGA controller])
      Region 0: Memory at b9000000 (64-bit, non-prefetchable) [disabled] [size=16M]
      Region 2: Memory at bfe0000000 (64-bit, prefetchable) [disabled] [size=256M]
      Expansion ROM at <ignored> [disabled]
      ...
      Capabilities: [420 v1] Physical Resizable BAR
      BAR 2: current size: 256MB, supported: 256MB 512MB 1GB 2GB 4GB 8GB

      0000:61:00.0 Audio device: Intel Corporation DG2 Audio Controller
      Region 0: Memory at ba000000 (64-bit, non-prefetchable) [size=16K]

      From iomem and dmesg, we can see that the root port at 0000:5d:00.0 has a 64GB, 64-bit, prefetchable aperture [0xb000000000-0xbfffffffff] available to it. The BIOS programs only 264MB of that aperture (the system does not support BIOS-enabled ReBAR).

      Of that 264MB, 8MB is allocated to the BAR of the upstream switch port at 5e:00.0. The remaining 256MB is available in the downstream aperture of this bridge and is allocated to the DG2 GPU at 60:00.0.

      Therefore, to make use of PCIe Resizable BARs, the PCI subsystem needs to release not only the GPU resources but also the upstream switch port's BAR resource. Linux currently refuses to do this:

      # cat /sys/bus/pci/devices/0000\:60\:00.0/resource2_resize
      0000000000003f00
      # echo 9 > /sys/bus/pci/devices/0000\:60\:00.0/resource2_resize
      -bash: echo: write error: No space left on device
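For readers unfamiliar with the resourceN_resize interface: the value read back is a bitmask in which bit n set means a BAR size of 2^n MB is supported, and the value written is the bit position of the desired size. The sketch below decodes the mask shown above; the helper name `decode_rebar_mask` is made up for illustration and is not part of any kernel or tool API.

```python
def decode_rebar_mask(mask_hex):
    """Map each set bit n of a resourceN_resize mask to its BAR size (2**n MB)."""
    mask = int(mask_hex, 16)
    sizes = {}
    for bit in range(mask.bit_length()):
        if mask & (1 << bit):
            mb = 1 << bit
            # Use GB for 1024 MB and above, matching how lspci prints the sizes.
            sizes[bit] = f"{mb // 1024}GB" if mb >= 1024 else f"{mb}MB"
    return sizes


# Mask read from resource2_resize in the report above.
supported = decode_rebar_mask("0000000000003f00")
for bit, size in supported.items():
    print(f"echo {bit} -> {size}")

# Writing 9 requests 2**9 MB; in bytes this is the size dmesg reports.
requested_bytes = (1 << 9) * 1024 * 1024
print(hex(requested_bytes))
```

The decoded list lines up with the "supported:" sizes lspci prints for BAR 2, and writing 9 corresponds to the 0x20000000 (512MB) request seen in dmesg.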

      dmesg reports:

      pci 0000:60:00.0: BAR 2: releasing [mem 0xbfe0000000-0xbfefffffff 64bit pref]
      pcieport 0000:5f:01.0: BAR 15: releasing [mem 0xbfe0000000-0xbfefffffff 64bit pref]
      pcieport 0000:5e:00.0: BAR 15: releasing [mem 0xbfe0000000-0xbfefffffff 64bit pref]
      pcieport 0000:5e:00.0: BAR 15: no space for [mem size 0x20000000 64bit pref]
      pcieport 0000:5e:00.0: BAR 15: failed to assign [mem size 0x20000000 64bit pref]
      pcieport 0000:5f:01.0: BAR 15: no space for [mem size 0x20000000 64bit pref]
      pcieport 0000:5f:01.0: BAR 15: failed to assign [mem size 0x20000000 64bit pref]
      pci 0000:60:00.0: BAR 2: no space for [mem size 0x20000000 64bit pref]
      pci 0000:60:00.0: BAR 2: failed to assign [mem size 0x20000000 64bit pref]
      pcieport 0000:5d:00.0: PCI bridge to [bus 5e-61]
      pcieport 0000:5d:00.0: bridge window [mem 0xb9000000-0xba0fffff]
      pcieport 0000:5d:00.0: bridge window [mem 0xbfe0000000-0xbff07fffff 64bit pref]
      pcieport 0000:5e:00.0: PCI bridge to [bus 5f-61]
      pcieport 0000:5e:00.0: bridge window [mem 0xb9000000-0xba0fffff]
      pcieport 0000:5e:00.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]
      pcieport 0000:5f:01.0: PCI bridge to [bus 60]
      pcieport 0000:5f:01.0: bridge window [mem 0xb9000000-0xb9ffffff]
      pcieport 0000:5f:01.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]
      pci 0000:60:00.0: BAR 2: assigned [mem 0xbfe0000000-0xbfefffffff 64bit pref]
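The "no space" failures follow from simple window arithmetic. The log shows the 5d:00.0 prefetchable window unchanged after the attempt, so the 512MB request has to fit inside it. A minimal sketch using only the ranges from the iomem and dmesg output above (the variable names are illustrative):

```python
MIB = 1 << 20


def span(start, end):
    """Size in bytes of an inclusive [start, end] address range."""
    return end - start + 1


root_aperture = span(0xb000000000, 0xbfffffffff)  # root bus 64-bit window
programmed = span(0xbfe0000000, 0xbff07fffff)     # 5d:00.0 prefetchable bridge window
switch_bar = span(0xbff0000000, 0xbff07fffff)     # 5e:00.0 reg 0x10
gpu_bar = span(0xbfe0000000, 0xbfefffffff)        # 60:00.0 reg 0x18
requested = 0x20000000                            # size in the failed assignments

print(f"root aperture: {root_aperture // MIB} MiB")
print(f"programmed:    {programmed // MIB} MiB")
print(f"switch BAR:    {switch_bar // MIB} MiB")
print(f"GPU BAR 2:     {gpu_bar // MIB} MiB")
print(f"requested:     {requested // MIB} MiB")
print("fits in programmed window:", requested <= programmed)
```

The root aperture has ample room for the request, but the window actually programmed at 5d:00.0 does not, and that window still holds the switch port BAR.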

      Meanwhile, an AMD GPU in the same system can trivially be resized:

      [0000:b0]00.0[b1-b3]---00.0[b2-b3]---00.0[b3]--+-00.0 Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 [Radeon Pro W5700]
      +-00.1 Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 HDMI Audio
      +-00.2 Advanced Micro Devices, Inc. [AMD/ATI] Device 7316
      -00.3 Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 USB

      PCI host bridge to bus 0000:b0
      pci_bus 0000:b0: root bus resource [io 0xc000-0xdfff window]
      pci_bus 0000:b0: root bus resource [mem 0xe1000000-0xee7fffff window]
      pci_bus 0000:b0: root bus resource [mem 0xe000000000-0xefffffffff window]
      pci_bus 0000:b0: root bus resource [bus b0-d7]

      pci 0000:b1:00.0: [1002:1478] type 01 class 0x060400
      pci 0000:b1:00.0: reg 0x10: [mem 0xe1200000-0xe1203fff]

      pci 0000:b0:00.0: PCI bridge to [bus b1-b3]
      pci 0000:b0:00.0: bridge window [io 0xc000-0xcfff]
      pci 0000:b0:00.0: bridge window [mem 0xe1000000-0xe12fffff]
      pci 0000:b0:00.0: bridge window [mem 0xefe0000000-0xeff01fffff 64bit pref]

      pci 0000:b1:00.0: PCI bridge to [bus b2-b3]
      pci 0000:b1:00.0: bridge window [io 0xc000-0xcfff]
      pci 0000:b1:00.0: bridge window [mem 0xe1000000-0xe11fffff]
      pci 0000:b1:00.0: bridge window [mem 0xefe0000000-0xeff01fffff 64bit pref]

      pci 0000:b3:00.0: [1002:7312] type 00 class 0x030000
      pci 0000:b3:00.0: reg 0x10: [mem 0xefe0000000-0xefefffffff 64bit pref]
      pci 0000:b3:00.0: reg 0x18: [mem 0xeff0000000-0xeff01fffff 64bit pref]
      pci 0000:b3:00.0: reg 0x20: [io 0xc000-0xc0ff]
      pci 0000:b3:00.0: reg 0x24: [mem 0xe1100000-0xe117ffff]
      pci 0000:b3:00.0: reg 0x30: [mem 0xfffe0000-0xffffffff pref]

      pci 0000:b3:00.1: [1002:ab38] type 00 class 0x040300
      pci 0000:b3:00.1: reg 0x10: [mem 0xe1184000-0xe1187fff]

      pci 0000:b3:00.2: [1002:7316] type 00 class 0x0c0330
      pci 0000:b3:00.2: reg 0x10: [mem 0xe1000000-0xe10fffff 64bit]

      pci 0000:b3:00.3: [1002:7314] type 00 class 0x0c8000
      pci 0000:b3:00.3: reg 0x10: [mem 0xe1180000-0xe1183fff 64bit]

      pci 0000:b2:00.0: PCI bridge to [bus b3]
      pci 0000:b2:00.0: bridge window [io 0xc000-0xcfff]
      pci 0000:b2:00.0: bridge window [mem 0xe1000000-0xe11fffff]
      pci 0000:b2:00.0: bridge window [mem 0xefe0000000-0xeff01fffff 64bit pref]

      0000:b0:00.0 PCI bridge: Intel Corporation Sky Lake-E PCI Express Root Port A (rev 07) (prog-if 00 [Normal decode])
      Bus: primary=b0, secondary=b1, subordinate=b3, sec-latency=0
      I/O behind bridge: 0000c000-0000cfff [size=4K]
      Memory behind bridge: e1000000-e12fffff [size=3M]
      Prefetchable memory behind bridge: 000000efe0000000-000000eff01fffff [size=258M]

      0000:b1:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch (prog-if 00 [Normal decode])
      Region 0: Memory at e1200000 (32-bit, non-prefetchable) [size=16K]
      Bus: primary=b1, secondary=b2, subordinate=b3, sec-latency=0
      I/O behind bridge: 0000c000-0000cfff [size=4K]
      Memory behind bridge: e1000000-e11fffff [size=2M]
      Prefetchable memory behind bridge: 000000efe0000000-000000eff01fffff [size=258M]

      0000:b2:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch (prog-if 00 [Normal decode])
      Bus: primary=b2, secondary=b3, subordinate=b3, sec-latency=0
      I/O behind bridge: 0000c000-0000cfff [size=4K]
      Memory behind bridge: e1000000-e11fffff [size=2M]
      Prefetchable memory behind bridge: 000000efe0000000-000000eff01fffff [size=258M]

      0000:b3:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 [Radeon Pro W5700] (prog-if 00 [VGA controller])
      Region 0: Memory at efe0000000 (64-bit, prefetchable) [disabled] [size=256M]
      Region 2: Memory at eff0000000 (64-bit, prefetchable) [disabled] [size=2M]
      Region 4: I/O ports at c000 [disabled] [size=256]
      Region 5: Memory at e1100000 (32-bit, non-prefetchable) [disabled] [size=512K]
      Expansion ROM at e11a0000 [disabled] [size=128K]
      ...
      Capabilities: [200 v1] Physical Resizable BAR
      BAR 0: current size: 256MB, supported: 256MB 512MB 1GB 2GB 4GB 8GB
      BAR 2: current size: 2MB, supported: 2MB 4MB 8MB 16MB 32MB 64MB 128MB 256MB

      0000:b3:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 HDMI Audio
      Region 0: Memory at e1184000 (32-bit, non-prefetchable) [size=16K]

      0000:b3:00.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 7316 (prog-if 30 [XHCI])
      Region 0: Memory at e1000000 (64-bit, non-prefetchable) [size=1M]

      0000:b3:00.3 Serial bus controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 USB
      Region 0: Memory at e1180000 (64-bit, non-prefetchable) [size=16K]

      The layout is ALMOST identical. The one difference is that the upstream switch port BAR at b1:00.0 uses the 32-bit, non-prefetchable range, so the ONLY consumer of the 64-bit, prefetchable range downstream from the root port is the GPU itself.

      This means Linux will happily make use of PCI Resizable BARs:

      # cat /sys/bus/pci/devices/0000\:b3\:00.0/resource0_resize
      0000000000003f00
      # echo 9 > /sys/bus/pci/devices/0000\:b3\:00.0/resource0_resize
      # echo $?
      0

      dmesg:

      pci 0000:b3:00.0: BAR 0: releasing [mem 0xefe0000000-0xefefffffff 64bit pref]
      pci 0000:b3:00.0: BAR 2: releasing [mem 0xeff0000000-0xeff01fffff 64bit pref]
      pcieport 0000:b2:00.0: BAR 15: releasing [mem 0xefe0000000-0xeff01fffff 64bit pref]
      pcieport 0000:b1:00.0: BAR 15: releasing [mem 0xefe0000000-0xeff01fffff 64bit pref]
      pcieport 0000:b0:00.0: BAR 15: releasing [mem 0xefe0000000-0xeff01fffff 64bit pref]
      pcieport 0000:b0:00.0: BAR 15: assigned [mem 0xe000000000-0xe02fffffff 64bit pref]
      pcieport 0000:b1:00.0: BAR 15: assigned [mem 0xe000000000-0xe02fffffff 64bit pref]
      pcieport 0000:b2:00.0: BAR 15: assigned [mem 0xe000000000-0xe02fffffff 64bit pref]
      pci 0000:b3:00.0: BAR 0: assigned [mem 0xe000000000-0xe01fffffff 64bit pref]
      pci 0000:b3:00.0: BAR 2: assigned [mem 0xe020000000-0xe0201fffff 64bit pref]
      pcieport 0000:b0:00.0: PCI bridge to [bus b1-b3]
      pcieport 0000:b0:00.0: bridge window [io 0xc000-0xcfff]
      pcieport 0000:b0:00.0: bridge window [mem 0xe1000000-0xe12fffff]
      pcieport 0000:b0:00.0: bridge window [mem 0xe000000000-0xe02fffffff 64bit pref]
      pcieport 0000:b1:00.0: PCI bridge to [bus b2-b3]
      pcieport 0000:b1:00.0: bridge window [io 0xc000-0xcfff]
      pcieport 0000:b1:00.0: bridge window [mem 0xe1000000-0xe11fffff]
      pcieport 0000:b1:00.0: bridge window [mem 0xe000000000-0xe02fffffff 64bit pref]
      pcieport 0000:b2:00.0: PCI bridge to [bus b3]
      pcieport 0000:b2:00.0: bridge window [io 0xc000-0xcfff]
      pcieport 0000:b2:00.0: bridge window [mem 0xe1000000-0xe11fffff]
      pcieport 0000:b2:00.0: bridge window [mem 0xe000000000-0xe02fffffff 64bit pref]
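The difference between the two topologies comes down to which devices have BARs inside the root port's 64-bit prefetchable window. The sketch below lists them for both cases, using the "reg" lines from the boot-time dmesg output above (expansion ROM entries are left out). The `Bar` type and `devices_in_window` helper are illustrative only and do not mirror kernel data structures.

```python
from typing import NamedTuple


class Bar(NamedTuple):
    device: str
    reg: str
    start: int
    end: int


def devices_in_window(bars, window):
    """Return the set of devices with at least one BAR fully inside window."""
    lo, hi = window
    return {b.device for b in bars if lo <= b.start and b.end <= hi}


# Device BARs below root port 5d:00.0 (Intel DG2).
intel_bars = [
    Bar("5e:00.0", "0x10", 0xbff0000000, 0xbff07fffff),
    Bar("60:00.0", "0x10", 0xb9000000, 0xb9ffffff),
    Bar("60:00.0", "0x18", 0xbfe0000000, 0xbfefffffff),
    Bar("61:00.0", "0x10", 0xba000000, 0xba003fff),
]
intel_pref_window = (0xbfe0000000, 0xbff07fffff)

# Device BARs below root port b0:00.0 (AMD Navi 10).
amd_bars = [
    Bar("b1:00.0", "0x10", 0xe1200000, 0xe1203fff),
    Bar("b3:00.0", "0x10", 0xefe0000000, 0xefefffffff),
    Bar("b3:00.0", "0x18", 0xeff0000000, 0xeff01fffff),
    Bar("b3:00.0", "0x24", 0xe1100000, 0xe117ffff),
    Bar("b3:00.1", "0x10", 0xe1184000, 0xe1187fff),
    Bar("b3:00.2", "0x10", 0xe1000000, 0xe10fffff),
    Bar("b3:00.3", "0x10", 0xe1180000, 0xe1183fff),
]
amd_pref_window = (0xefe0000000, 0xeff01fffff)

intel_consumers = devices_in_window(intel_bars, intel_pref_window)
amd_consumers = devices_in_window(amd_bars, amd_pref_window)
print("Intel:", sorted(intel_consumers))
print("AMD:  ", sorted(amd_consumers))
```

On the AMD side only the GPU function appears, so releasing its BARs empties the window and lets every bridge above it be reassigned. On the Intel side the switch upstream port is listed alongside the GPU.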

      Because the Intel DG2 places its upstream switch port BAR in the same resource pool as the GPU's resizable BAR, the Linux PCI core needs enhancements to handle this scenario.

      Version-Release number of selected component (if applicable):

      The issue is present in the upstream Linux kernel as of v6.3. The above is from 5.14.0-306.el9.x86_64.

      How reproducible:
      100%

      Steps to Reproduce:
      1. As outlined above

      Actual results:
      Resizable BARs cannot be configured from the OS on Intel DG2 GPUs, but they can be on AMD GPUs.

      Expected results:
      Better hardware choices or enhancement to the Linux PCI resource subsystem to manage the upstream switch resource BAR.

      Additional info:
      Do we need a pci=reassign option to move 64-bit, prefetchable resources for non-endpoints to the 32-bit, non-prefetchable address space?

              mstowe@redhat.com Myron Stowe
              alex.williamson@redhat.com alex.williamson@redhat.com (Inactive)
              Myron Stowe Myron Stowe
              William Gomeringer William Gomeringer