RHEL-114415

Overall guest NUMA policy overrides the specific strict policy of a memory device



      What were you trying to do that didn't work?

      A guest is configured with an overall guest NUMA policy as below:
        <numatune>
          <memory mode="preferred" nodeset="1" />
        </numatune>
      and there is a memory device whose source is defined as below:
          <source>
            <pagesize unit='KiB'>1048576</pagesize>
            <nodemask>0</nodemask>
          </source>
      After the guest starts up, the memory device NUMA policy is overridden to "preferred" instead of "bind" (strict), and I also observe the hugepage memory allocation sometimes (but not always) drifting to the other NUMA node.

      Please provide the package NVR for which the bug is seen:

      rhel9.7:

      # rpm -q libvirt qemu-kvm
      libvirt-10.10.0-15.el9.x86_64
      qemu-kvm-9.1.0-26.el9.x86_64

      rhel10.1:

      # rpm -q libvirt qemu-kvm
      libvirt-11.5.0-4.el10.x86_64
      qemu-kvm-10.0.0-12.el10.x86_64

      How reproducible is this bug?: 100%

      Steps to reproduce

      1. On a multi-NUMA-node host:

      # numactl --hardware
      available: 2 nodes (0-1)
      node 0 cpus: 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38
      node 0 size: 31560 MB
      node 0 free: 27664 MB
      node 1 cpus: 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39
      node 1 size: 32199 MB
      node 1 free: 26162 MB
      node distances:
      node     0    1 
         0:   10   21 
         1:   21   10 

      2. Prepare the hugepage environment:

      # echo 2 > /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages
      # echo 2 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
      # mkdir /dev/hugepages1048576
      # mount -t hugetlbfs -o pagesize=1048576K hugetlbfs /dev/hugepages1048576
      # systemctl restart virtqemud
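
      The reservations above go through per-node sysfs files. As a minimal sketch (the helper name is ours, for illustration only), the path written in this step can be composed as:

```python
def hugepage_sysfs_path(node: int, pagesize_kib: int) -> str:
    """Build the sysfs file that controls the per-node hugepage pool.

    Writing an integer N to this file reserves N hugepages of the given
    size on the given NUMA node, as the echo commands above do.
    """
    return (f"/sys/devices/system/node/node{node}"
            f"/hugepages/hugepages-{pagesize_kib}kB/nr_hugepages")

# The two files written in the reproduction steps:
print(hugepage_sysfs_path(1, 1048576))
print(hugepage_sysfs_path(0, 1048576))
```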
      


      3. Start the guest with the config XML below:
        <maxMemory slots="16" unit="KiB">15243264</maxMemory>
        <memory unit="KiB">3145728</memory>
        <currentMemory unit="KiB">3145728</currentMemory>
         <vcpu placement="static">4</vcpu>
        <numatune>
          <memory mode="preferred" nodeset="1" />
        </numatune>
        <cpu mode='host-model' check='partial'>
          <numa>
            <cell id="0" cpus="0-1" memory="1048576" unit="KiB" />
            <cell id="1" cpus="2-3" memory="1048576" unit="KiB" />
          </numa>
        </cpu>
        ...
        <memory model='dimm'>
          <source>
            <pagesize unit='KiB'>1048576</pagesize>
            <nodemask>0</nodemask>
          </source>
          <target>
            <size unit='KiB'>1048576</size>
            <node>0</node>
        </target>
        </memory>
        ...

      # virsh start avocado-vt-vm1
      Domain 'avocado-vt-vm1' started

      4. Check the QEMU command line; the memory device NUMA policy is the same as the guest-wide "preferred":

      # ps -ef | grep qemu-kvm
      -object {"qom-type":"thread-context","id":"tc-memdimm0","node-affinity":[0]} -object {"qom-type":"memory-backend-file","id":"memdimm0","mem-path":"/dev/hugepages1048576/libvirt/qemu/5-avocado-vt-vm1","prealloc":true,"size":1073741824,"host-nodes":[0],"policy":"preferred","prealloc-context":"tc-memdimm0"} -device {"driver":"pc-dimm","node":0,"memdev":"memdimm0","id":"dimm0","slot":0}

      5. Check the hugepage allocation; the memory device's 1G hugepage is allocated on node 0:


      # virsh freepages --all
      Node 0:
      4KiB: 6936642
      2048KiB: 0
      1048576KiB: 1
      Node 1:
      4KiB: 7526007
      2048KiB: 0
      1048576KiB: 2
      


      6. After the guest boots up, wait a while (around 2 minutes) and check the hugepage allocation again; the memory device's hugepage memory drifts to node 1 (not always):


      # virsh freepages --all
      Node 0:
      4KiB: 6937610
      2048KiB: 0
      1048576KiB: 2
      Node 1:
      4KiB: 7326549
      2048KiB: 0
      1048576KiB: 1
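
      The drift can be confirmed mechanically by diffing two `virsh freepages --all` snapshots; a minimal sketch (parser only, assuming the output format shown above), fed with the two snapshots from steps 5 and 6:

```python
import re

def parse_freepages(text: str) -> dict:
    """Parse `virsh freepages --all` output into {node: {pagesize_kib: count}}."""
    pages, node = {}, None
    for line in text.splitlines():
        m = re.match(r"Node (\d+):", line.strip())
        if m:
            node = int(m.group(1))
            pages[node] = {}
            continue
        m = re.match(r"(\d+)KiB:\s*(\d+)", line.strip())
        if m and node is not None:
            pages[node][int(m.group(1))] = int(m.group(2))
    return pages

before = parse_freepages("""Node 0:
4KiB: 6936642
2048KiB: 0
1048576KiB: 1
Node 1:
4KiB: 7526007
2048KiB: 0
1048576KiB: 2""")
after = parse_freepages("""Node 0:
4KiB: 6937610
2048KiB: 0
1048576KiB: 2
Node 1:
4KiB: 7326549
2048KiB: 0
1048576KiB: 1""")

# Free 1G pages on node 0 go from 1 back to 2: the guest's 1G page left node 0.
drift = {n: after[n][1048576] - before[n][1048576] for n in before}
print(drift)  # {0: 1, 1: -1}
```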
      


      Expected results:

      1. The memory device NUMA policy should be "bind" (strict), not the overall policy "preferred".
      2. The hugepage used by the memory device should always be allocated on node 0.

      Actual results:

      1. The memory device NUMA policy is overridden by the overall policy "preferred".
      2. The node 0 hugepage used by the memory device drifts to node 1 after the guest boots up.

      Assignee: virt-maint
      Reporter: Liang Cong (lcong@redhat.com)