  1. RHEL
  2. RHEL-146584

[RHEL-10.2][ARM]: Unable to Check the mem prefetched size on Guest


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • rhel-10.2
    • None
    • qemu-kvm
    • None
    • Important
    • rhel-virt-core
    • 25
    • 26
    • None
    • False
    • False
    • None
    • None
    • Unspecified
    • Unspecified
    • Unspecified
    • None

      What were you trying to do that didn't work?

      1. After attaching the GPU device to the guest VM, checking the prefetchable memory size with lspci returned no output.
      2. Although the PCI bus memory range is 512G, the guest kernel fails to assign three BARs (Base Address Registers) of the device at PCI address 08:00.0, and the following errors appear in the serial log:

      [   11.007352] pci 0000:08:00.0: [10de:2342] type 00 class 0x030200 PCIe Endpoint
      2026-02-04 23:21:05: [   11.008390] pci 0000:08:00.0: BAR 0 [mem 0x00000000-0x00ffffff 64bit pref]
      2026-02-04 23:21:05: [   11.009018] pci 0000:08:00.0: BAR 2 [mem 0x00000000-0x7fffffff 64bit pref]
      2026-02-04 23:21:05: [   11.009514] pci 0000:08:00.0: BAR 4 [mem 0x00000000-0x1fffffffff 64bit pref]
      2026-02-04 23:21:05: [   11.010034] pci 0000:08:00.0: Max Payload Size set to 128 (was 256, max 256)
      2026-02-04 23:21:05: [   11.010612] pci 0000:08:00.0: Enabling HDA controller
      2026-02-04 23:21:05: [   11.011461] pci 0000:08:00.0: 15.753 Gb/s available PCIe bandwidth, limited by 16.0 GT/s PCIe x1 link at 0000:00:01.7 (capable of 31.507 Gb/s with 32.0 GT/s PCIe x1 link)
      2026-02-04 23:21:05: [   11.012987] pci 0000:08:00.0: BAR 4 [mem size 0x2000000000 64bit pref]: can't assign; no space
      2026-02-04 23:21:05: [   11.013589] pci 0000:08:00.0: BAR 4 [mem size 0x2000000000 64bit pref]: failed to assign
      2026-02-04 23:21:05: [   11.014144] pci 0000:08:00.0: BAR 2 [mem size 0x80000000 64bit pref]: can't assign; no space
      2026-02-04 23:21:05: [   11.014743] pci 0000:08:00.0: BAR 2 [mem size 0x80000000 64bit pref]: failed to assign
      2026-02-04 23:21:05: [   11.017348] pci 0000:08:00.0: BAR 0 [mem size 0x01000000 64bit pref]: can't assign; no space
      2026-02-04 23:21:05: [   11.017979] pci 0000:08:00.0: BAR 0 [mem size 0x01000000 64bit pref]: failed to assign
      2026-02-04 23:21:05: [   11.019481] pci 0000:08:00.0: BAR 4 [mem size 0x2000000000 64bit pref]: can't assign; no space
      2026-02-04 23:21:05: [   11.020374] pci 0000:08:00.0: BAR 4 [mem size 0x2000000000 64bit pref]: failed to assign
      2026-02-04 23:21:05: [   11.020940] pci 0000:08:00.0: BAR 2 [mem size 0x80000000 64bit pref]: can't assign; no space
      2026-02-04 23:21:05: [   11.020942] pci 0000:08:00.0: BAR 2 [mem size 0x80000000 64bit pref]: failed to assign
      2026-02-04 23:21:05: [   11.020944] pci 0000:08:00.0: BAR 0 [mem size 0x01000000 64bit pref]: can't assign; no space
      2026-02-04 23:21:05: [   11.020945] pci 0000:08:00.0: BAR 0 [mem size 0x01000000 64bit pref]: failed to assign
       

       

      3. After rebooting the guest, the prefetchable memory information is visible:

      [root@localhost ~]# lspci -vv -s 08:00.0 | grep -i prefetchable
      Region 0: Memory at a080000000 (64-bit, prefetchable) [size=16M]	
      Region 2: Memory at a000000000 (64-bit, prefetchable) [size=2G]	
      Region 4: Memory at 8000000000 (64-bit, prefetchable) [size=128G]
      [root@localhost ~]# 
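
The BAR sizes in the serial log above are hexadecimal byte counts. As a rough sanity check, the Python sketch below converts them to binary units and compares their sum with a 512 GiB window. It is not part of the original report: the helper name `bar_sizes` and the shortened sample lines are illustrative. It ignores alignment and bridge-window constraints, so it can only rule out the simplest explanation, namely that the BARs together are larger than the window.

```python
import re

# Matches kernel messages such as:
#   pci 0000:08:00.0: BAR 4 [mem size 0x2000000000 64bit pref]: can't assign; no space
BAR_SIZE_RE = re.compile(r"BAR (\d+) \[mem size (0x[0-9a-fA-F]+)")


def bar_sizes(log_text):
    """Return {bar_index: size_in_bytes} for every 'mem size' message found."""
    sizes = {}
    for match in BAR_SIZE_RE.finditer(log_text):
        sizes[int(match.group(1))] = int(match.group(2), 16)
    return sizes


# Sample lines taken from the serial log, with timestamps dropped.
SAMPLE = """\
pci 0000:08:00.0: BAR 4 [mem size 0x2000000000 64bit pref]: can't assign; no space
pci 0000:08:00.0: BAR 2 [mem size 0x80000000 64bit pref]: can't assign; no space
pci 0000:08:00.0: BAR 0 [mem size 0x01000000 64bit pref]: can't assign; no space
"""

GIB = 1 << 30
sizes = bar_sizes(SAMPLE)
total = sum(sizes.values())
window = 512 * GIB  # assumed size of the guest PCI bus memory range

for index in sorted(sizes):
    print(f"BAR {index}: {sizes[index] / GIB:g} GiB")
print(f"total {total / GIB:g} GiB, fits by size alone: {total <= window}")
```

The decoded sizes (16 MiB, 2 GiB and 128 GiB) match the regions that lspci reports after the reboot, and their sum is well below 512 GiB.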
       

      Please provide the package NVR for which the bug is seen:

      64k Host name:   nvidia-grace-hopper-08.khw.eng.bos2.dc.redhat.com
      64k Host Kernel: 6.12.0-195.el10.aarch64+64k
      QEMU:  qemu-kvm-10.1.0-12.el10

      libvirt: libvirt-11.10.0-4.el10.aarch64
      edk2: edk2-aarch64-20251114-2.el10.noarch
      64k Guest Kernel: 6.12.0-195.el10.aarch64+64k

      How reproducible is this bug?:

      100%

      Steps to reproduce

      1. Test case - https://polarion.engineering.redhat.com/polarion/#/project/RHELVIRT/workitem?id=VIRT-304413
      2. Start a VM with the highmem-mmio-size PCI feature.
      3. Add the following GPU device, for both 512G and 1T:
      <hostdev mode="subsystem" type="pci" managed="yes">
        <source>
          <address domain="0x0009" bus="0x01" slot="0x00" function="0x0"/>
        </source>
      </hostdev>
      4. Get the GPU device's PCI address on the host:
      # lspci | grep 3D
      0009:01:00.0 3D controller: NVIDIA Corporation GH100 [GH200 120GB / 480GB] (rev a1)

      5. Check the prefetchable memory size on the host first:
      # lspci -vs 0009:01:00.0
      0009:01:00.0 3D controller: NVIDIA Corporation GH100 [GH200 120GB / 480GB] (rev a1)
      Subsystem: NVIDIA Corporation Device 1809
      Physical Slot: 255-1
      Flags: fast devsel, IRQ 248, NUMA node 0, IOMMU group 16
      Memory at 661002000000 (64-bit, prefetchable) [disabled] [size=16M]
      Memory at 662000000000 (64-bit, prefetchable) [disabled] [size=128G]
      Memory at 661000000000 (64-bit, prefetchable) [disabled] [size=32M]
      6. Check the PCI bus memory range in the guest and confirm that it is set to 512G:
      root@localhost:~# awk '/PCI ECAM/, /PCI Bus 0000:00/ {print $0}' /proc/iomem
      4010000000-401fffffff : PCI ECAM
      8000000000-ffffffffff : PCI Bus 0000:00
      7. Check whether the GPU device got its memory in the guest.
        The memory regions are assigned without error if the required size is less than the available range.

      1. Get the GPU device's PCI address in the guest:
      # lspci | grep 3D
      08:00.0 3D controller: NVIDIA Corporation GH100 [GH200 120GB / 480GB] (rev a1)

      2. Check Memory size
      root@localhost:~# lspci -vs 08:00.0

      08:00.0 3D controller: NVIDIA Corporation GH100 [GH200 120GB / 480GB] (rev a1)
      Subsystem: NVIDIA Corporation Device 1809
      Physical Slot: 0-7
      Flags: fast devsel, IRQ 46
      Memory at 14000000000 (64-bit, prefetchable) [size=256G]
      Memory at 18000000000 (64-bit, prefetchable) [size=256G]
      Memory at 1c000000000 (64-bit, prefetchable) [size=256G]

      3. Check dmesg

      [  776.055821] pci 0000:08:00.0: Max Payload Size set to 128 (was 256, max 256)
      [  776.056455] pci 0000:08:00.0: Enabling HDA controller
      [  776.056937] pci 0000:08:00.0: BAR 0 [mem 0x00000000-0x00ffffff 64bit pref]: requesting alignment to 0x4000000000
      [  776.057629] pci 0000:08:00.0: BAR 2 [mem 0x00000000-0x1fffffffff 64bit pref]: requesting alignment to 0x4000000000
      [  776.058331] pci 0000:08:00.0: BAR 4 [mem 0x00000000-0x01ffffff 64bit pref]: requesting alignment to 0x4000000000
      [  776.059424] pci 0000:08:00.0: 15.753 Gb/s available PCIe bandwidth, limited by 16.0 GT/s PCIe x1 link at 0000:00:01.7 (capable of 31.507 Gb/s with 32.0 GT/s PCIe x1 link)
      [  776.060834] pci 0000:08:00.0: BAR 0 [mem 0x14000000000-0x17fffffffff 64bit pref]: assigned
      [  776.061506] pci 0000:08:00.0: BAR 2 [mem 0x18000000000-0x1bfffffffff 64bit pref]: assigned
      [  776.062163] pci 0000:08:00.0: BAR 4 [mem 0x1c000000000-0x1ffffffffff 64bit pref]: assigned
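
The 512G check on the `PCI Bus 0000:00` line of /proc/iomem in the steps above is plain hexadecimal arithmetic: the range is inclusive, so its size is end - start + 1. A minimal Python sketch of that calculation follows. The helper name `iomem_window` is hypothetical and not part of the test case.

```python
def iomem_window(line):
    """Parse one /proc/iomem line and return (start, end, size_in_bytes, name)."""
    span, _, name = line.partition(" : ")
    start_hex, _, end_hex = span.strip().partition("-")
    start, end = int(start_hex, 16), int(end_hex, 16)
    # /proc/iomem ranges are inclusive on both ends.
    return start, end, end - start + 1, name.strip()


GIB = 1 << 30
start, end, size, name = iomem_window("8000000000-ffffffffff : PCI Bus 0000:00 ")
print(f"{name}: {size // GIB} GiB starting at {start:#x}")
# prints "PCI Bus 0000:00: 512 GiB starting at 0x8000000000"
```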

      Expected results

      1. Check Memory size
        root@localhost:~# lspci -vs 08:00.0
      08:00.0 3D controller: NVIDIA Corporation GH100 [GH200 120GB / 480GB] (rev a1)
      Subsystem: NVIDIA Corporation Device 1809
      Physical Slot: 0-7
      Flags: fast devsel, IRQ 46
      Memory at 14000000000 (64-bit, prefetchable) [size=256G]
      Memory at 18000000000 (64-bit, prefetchable) [size=256G]
      Memory at 1c000000000 (64-bit, prefetchable) [size=256G]

      2. Check dmesg

      [  776.055821] pci 0000:08:00.0: Max Payload Size set to 128 (was 256, max 256)
      [  776.056455] pci 0000:08:00.0: Enabling HDA controller
      [  776.056937] pci 0000:08:00.0: BAR 0 [mem 0x00000000-0x00ffffff 64bit pref]: requesting alignment to 0x4000000000
      [  776.057629] pci 0000:08:00.0: BAR 2 [mem 0x00000000-0x1fffffffff 64bit pref]: requesting alignment to 0x4000000000
      [  776.058331] pci 0000:08:00.0: BAR 4 [mem 0x00000000-0x01ffffff 64bit pref]: requesting alignment to 0x4000000000
      [  776.059424] pci 0000:08:00.0: 15.753 Gb/s available PCIe bandwidth, limited by 16.0 GT/s PCIe x1 link at 0000:00:01.7 (capable of 31.507 Gb/s with 32.0 GT/s PCIe x1 link)
      [  776.060834] pci 0000:08:00.0: BAR 0 [mem 0x14000000000-0x17fffffffff 64bit pref]: assigned
      [  776.061506] pci 0000:08:00.0: BAR 2 [mem 0x18000000000-0x1bfffffffff 64bit pref]: assigned
      [  776.062163] pci 0000:08:00.0: BAR 4 [mem 0x1c000000000-0x1ffffffffff 64bit pref]: assigned
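
The expected dmesg output above can be checked mechanically: each "assigned" range should start on the alignment named in the "requesting alignment to" messages, and its size should match the size lspci prints. The Python sketch below is one way to do that. The helper name `assigned_bars` is an assumption for illustration, and the sample lines are copied from the expected output with timestamps dropped.

```python
import re

# Matches kernel messages such as:
#   pci 0000:08:00.0: BAR 0 [mem 0x14000000000-0x17fffffffff 64bit pref]: assigned
ASSIGNED_RE = re.compile(
    r"BAR (\d+) \[mem (0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+) 64bit pref\]: assigned"
)


def assigned_bars(log_text):
    """Return a list of (bar_index, start, size_in_bytes) for 'assigned' messages."""
    result = []
    for match in ASSIGNED_RE.finditer(log_text):
        start, end = int(match.group(2), 16), int(match.group(3), 16)
        result.append((int(match.group(1)), start, end - start + 1))
    return result


SAMPLE = """\
pci 0000:08:00.0: BAR 0 [mem 0x14000000000-0x17fffffffff 64bit pref]: assigned
pci 0000:08:00.0: BAR 2 [mem 0x18000000000-0x1bfffffffff 64bit pref]: assigned
pci 0000:08:00.0: BAR 4 [mem 0x1c000000000-0x1ffffffffff 64bit pref]: assigned
"""

ALIGNMENT = 0x4000000000  # value from the "requesting alignment to" messages
GIB = 1 << 30

bars = assigned_bars(SAMPLE)
for index, start, size in bars:
    aligned = start % ALIGNMENT == 0
    print(f"BAR {index}: start {start:#x}, size {size // GIB} GiB, aligned: {aligned}")
```

For the sample lines, every BAR is 256 GiB in size and starts on a 0x4000000000 boundary, which agrees with the `[size=256G]` entries in the expected lspci output.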

      Actual results

      1. No output is returned for:
      root@localhost:~# lspci -vs 08:00.0

      2. The dmesg shows:

      [   11.007352] pci 0000:08:00.0: [10de:2342] type 00 class 0x030200 PCIe Endpoint
      2026-02-04 23:21:05: [   11.008390] pci 0000:08:00.0: BAR 0 [mem 0x00000000-0x00ffffff 64bit pref]
      2026-02-04 23:21:05: [   11.009018] pci 0000:08:00.0: BAR 2 [mem 0x00000000-0x7fffffff 64bit pref]
      2026-02-04 23:21:05: [   11.009514] pci 0000:08:00.0: BAR 4 [mem 0x00000000-0x1fffffffff 64bit pref]
      2026-02-04 23:21:05: [   11.010034] pci 0000:08:00.0: Max Payload Size set to 128 (was 256, max 256)
      2026-02-04 23:21:05: [   11.010612] pci 0000:08:00.0: Enabling HDA controller
      2026-02-04 23:21:05: [   11.011461] pci 0000:08:00.0: 15.753 Gb/s available PCIe bandwidth, limited by 16.0 GT/s PCIe x1 link at 0000:00:01.7 (capable of 31.507 Gb/s with 32.0 GT/s PCIe x1 link)
      2026-02-04 23:21:05: [   11.012987] pci 0000:08:00.0: BAR 4 [mem size 0x2000000000 64bit pref]: can't assign; no space
      2026-02-04 23:21:05: [   11.013589] pci 0000:08:00.0: BAR 4 [mem size 0x2000000000 64bit pref]: failed to assign
      2026-02-04 23:21:05: [   11.014144] pci 0000:08:00.0: BAR 2 [mem size 0x80000000 64bit pref]: can't assign; no space
      2026-02-04 23:21:05: [   11.014743] pci 0000:08:00.0: BAR 2 [mem size 0x80000000 64bit pref]: failed to assign
      2026-02-04 23:21:05: [   11.017348] pci 0000:08:00.0: BAR 0 [mem size 0x01000000 64bit pref]: can't assign; no space
      2026-02-04 23:21:05: [   11.017979] pci 0000:08:00.0: BAR 0 [mem size 0x01000000 64bit pref]: failed to assign
      2026-02-04 23:21:05: [   11.019481] pci 0000:08:00.0: BAR 4 [mem size 0x2000000000 64bit pref]: can't assign; no space
      2026-02-04 23:21:05: [   11.020374] pci 0000:08:00.0: BAR 4 [mem size 0x2000000000 64bit pref]: failed to assign
      2026-02-04 23:21:05: [   11.020940] pci 0000:08:00.0: BAR 2 [mem size 0x80000000 64bit pref]: can't assign; no space
      2026-02-04 23:21:05: [   11.020942] pci 0000:08:00.0: BAR 2 [mem size 0x80000000 64bit pref]: failed to assign
      2026-02-04 23:21:05: [   11.020944] pci 0000:08:00.0: BAR 0 [mem size 0x01000000 64bit pref]: can't assign; no space
      2026-02-04 23:21:05: [   11.020945] pci 0000:08:00.0: BAR 0 [mem size 0x01000000 64bit pref]: failed to assign
       

              eauger Eric Auger
              rh-ee-meshetty Meghana Shetty
              virt-maint virt-maint
              virt-bugs virt-bugs