RHEL / RHEL-7481

use "-numa cpu" instead of "-numa node,cpus=" in libvirt


      '-numa node,cpus=' has some problems and will be deprecated by qemu in the near future; qemu recommends '-numa node,nodeid=xx' (together with '-numa cpu') instead. So libvirt should be updated to use this syntax for NUMA configuration.
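      As an illustration of the two syntaxes, here is a hypothetical sketch (not libvirt code) that translates a legacy "-numa node,cpus=" layout into the recommended "-numa node,nodeid=" plus "-numa cpu" form, assuming pseries-style core-ids (a cpu-id rounded down to a multiple of 'threads'):

```python
# Hypothetical sketch (not libvirt code): translate a legacy
# "-numa node,cpus=" layout into "-numa node,nodeid=" + "-numa cpu",
# assuming core-id = cpu-id rounded down to a multiple of 'threads'.

def legacy_args(node_cpus):
    """node_cpus: one list of cpu-ids per NUMA node."""
    return ["-numa node," + ",".join("cpus=%d" % c for c in cpus)
            for cpus in node_cpus]

def new_args(node_cpus, threads=1):
    args = ["-numa node,nodeid=%d" % n for n in range(len(node_cpus))]
    for node, cpus in enumerate(node_cpus):
        # one "-numa cpu" entry per core containing a requested cpu-id
        for core in sorted({c - c % threads for c in cpus}):
            args.append("-numa cpu,node-id=%d,core-id=%d" % (node, core))
    return args

print(legacy_args([[0, 1], [2, 3]]))
print(new_args([[0, 1], [2, 3]], threads=2))
```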

      +++ This bug was initially created as a clone of Bug #1669434 +++

      Description of problem:
      ppc64le uses the number of CPUs (-smp) instead of maxcpus to check the CPU assignment

      Version-Release number of selected component (if applicable):
      qemu-kvm-3.1.0-5.module+el8+2708+fbd828c6.ppc64le
      kernel-4.18.0-59.el8.ppc64le

      How reproducible:
      5/5

      Steps to Reproduce:
      1. /usr/libexec/qemu-kvm -enable-kvm -M pseries -cpu host -m 8192 -smp 4,maxcpus=8,cores=2,threads=2,sockets=2 -numa node,mem=4G,cpus=0,cpus=1 -numa node,mem=4G,cpus=3,cpus=4 -numa node -numa node
      2.

      Actual results:
      qemu-kvm: warning: CPU(s) not present in any NUMA nodes: CPU 3 [core-id: 6]
      qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config, ability to start up with partial NUMA mappings is obsoleted and will be removed in future

      Expected results:
      Output similar to the x86 result; the CPU assignment should be checked against maxcpus, not -smp xx

      Additional info:
      It will not be reproducible on x86 platform.
      /usr/libexec/qemu-kvm -enable-kvm -M pc -cpu host -m 8192 -smp 4,maxcpus=8,cores=2,threads=2,sockets=2 -numa node,mem=4G,cpus=0,cpus=1 -numa node,mem=4G,cpus=3,cpus=4 -numa node -numa node
      qemu-kvm: warning: CPU(s) not present in any NUMA nodes: CPU 2 [socket-id: 0, core-id: 1, thread-id: 0], CPU 5 [socket-id: 1, core-id: 0, thread-id: 1], CPU 6 [socket-id: 1, core-id: 1, thread-id: 0], CPU 7 [socket-id: 1, core-id: 1, thread-id: 1]
      qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config, ability to start up with partial NUMA mappings is obsoleted and will be removed in future

      — Additional comment from Min Deng on 2019-01-25 09:16:09 UTC —

      It's a ppc-only problem. Thanks to Laurent for the confirmation.

      — Additional comment from Gu Nini on 2019-01-28 11:28:27 UTC —

      In my test with core-id, I hit a similar problem, and I think there is another issue besides the one in this bug: the 'core-id' must be an even number (i.e. 0, 2, 4, 6, 8...) so that it can be counted as the 'CPU' number (i.e. 0, 1, 2, 3, 4...), which differs from the x86 behaviour.

      Scenario I:
      Tried to boot up a guest with 'core-id' in odd number:
      -smp 4,cores=2,threads=2,sockets=1 \
      -numa node,nodeid=0 \
      -numa node,nodeid=1 \
      -numa cpu,node-id=0,core-id=0 \
      -numa cpu,node-id=1,core-id=1 \

      Failed to boot up the guest
      QEMU 3.1.0 monitor - type 'help' for more information
      (qemu) qemu-kvm: -numa cpu,node-id=1,core-id=1: no match found

      Scenario II:
      Tried to boot up a guest with 'maxcpus' and the 'core-id' in even number:
      -smp 4,maxcpus=12,cores=2,threads=2,sockets=3 \
      -numa node,nodeid=0 \
      -numa node,nodeid=1 \
      -numa cpu,node-id=0,core-id=0 \
      -numa cpu,node-id=1,core-id=2 \

      The guest booted successfully, but there was warning info similar to that in the bug (and the 'CPU' id corresponding to each 'core-id' is unusual); besides, the remaining cpus (i.e. 4~11) are not listed in the 'info numa' output, and the same result can be seen in the 'numactl -H' output inside the guest:
      QEMU 3.1.0 monitor - type 'help' for more information
      (qemu) qemu-kvm: warning: CPU(s) not present in any NUMA nodes: CPU 2 [core-id: 4], CPU 3 [core-id: 6], CPU 4 [core-id: 8], CPU 5 [core-id: 10]
      qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config, ability to start up with partial NUMA mappings is obsoleted and will be removed in future

      (qemu)
      (qemu) info hotpluggable-cpus
      Hotpluggable CPUs:
      type: "host-spapr-cpu-core"
      vcpus_count: "2"
      CPUInstance Properties:
      node-id: "0"
      core-id: "10"
      type: "host-spapr-cpu-core"
      vcpus_count: "2"
      CPUInstance Properties:
      node-id: "0"
      core-id: "8"
      type: "host-spapr-cpu-core"
      vcpus_count: "2"
      CPUInstance Properties:
      node-id: "0"
      core-id: "6"
      type: "host-spapr-cpu-core"
      vcpus_count: "2"
      CPUInstance Properties:
      node-id: "0"
      core-id: "4"
      type: "host-spapr-cpu-core"
      vcpus_count: "2"
      qom_path: "/machine/unattached/device[1]"
      CPUInstance Properties:
      node-id: "1"
      core-id: "2"
      type: "host-spapr-cpu-core"
      vcpus_count: "2"
      qom_path: "/machine/unattached/device[0]"
      CPUInstance Properties:
      node-id: "0"
      core-id: "0"
      (qemu)
      (qemu) info numa
      2 nodes
      node 0 cpus: 0 1
      node 0 size: 10240 MB
      node 0 plugged: 0 MB
      node 1 cpus: 2 3
      node 1 size: 10240 MB
      node 1 plugged: 0 MB
      (qemu)

      The numa topology inside the guest:

      # numactl -H
        available: 2 nodes (0-1)
        node 0 cpus: 0 1
        node 0 size: 10240 MB
        node 0 free: 8747 MB
        node 1 cpus: 2 3
        node 1 size: 10240 MB
        node 1 free: 9854 MB
        node distances:
        node 0 1
        0: 10 40
        1: 40 10

      — Additional comment from Gu Nini on 2019-01-30 04:48:03 UTC —

      (In reply to Gu Nini from comment #2)
      > In my test with core-id, I met similar problem, and I think there is another
      > issue besides the one in the bug, i.e. the 'core-id' must be an even
      > number(i.e. 0, 2, 4, 6, 8...) so that it could be counted as the 'CPU'
      > number(i.e. 0, 1, 2, 3, 4...), which is different from the x86 behaviour.
      >
      >

      Now I got that the 'core-id' should be a multiple of the 'threads' value specified
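      That rule can be written down as a small sketch (my reading of the pseries behaviour, not QEMU code), checked against the two scenarios above:

```python
# Sketch: on pseries, valid core-ids are the multiples of 'threads'
# below maxcpus (assumption based on the observed behaviour).
def valid_core_ids(maxcpus, threads):
    return set(range(0, maxcpus, threads))

# Scenario I: -smp 4,cores=2,threads=2 -> core-id 1 gives "no match found"
print(1 in valid_core_ids(4, 2))    # False
# Scenario II: -smp 4,maxcpus=12,threads=2 -> core-id 2 is accepted
print(2 in valid_core_ids(12, 2))   # True
```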

      — Additional comment from Laurent Vivier on 2019-01-31 17:23:34 UTC —

      This is a regression between qemu v2.9.0 and v2.10.0:

      qemu-system-ppc64 -smp 4,maxcpus=8,cores=2,threads=2,sockets=2 -numa node,cpus=0,cpus=1 -numa node,cpus=3,cpus=4 -numa node -numa node

      v2.9.0

      qemu-system-ppc64: warning: CPU(s) not present in any NUMA nodes: 2 5 6 7
      qemu-system-ppc64: warning: All CPU(s) up to maxcpus should be described in NUMA config
      qemu-system-ppc64: Invalid node-id=1 of thread[cpu-index: 3] on CPU[core-id: 2, node-id: 4], node-id must be the same

      v2.10.0

      qemu-system-ppc64: warning: CPU(s) not present in any NUMA nodes: CPU 3 [core-id: 6]
      qemu-system-ppc64: warning: All CPU(s) up to maxcpus should be described in NUMA config, ability to start up with partial NUMA mappings is obsoleted and will be removed in future

      The regression has been introduced by:

      commit ec78f8114bc4c133fc56fefa7f2af99725e42857
      Author: Igor Mammedov <imammedo@redhat.com>
      Date: Wed May 10 13:29:59 2017 +0200

      numa: use possible_cpus for not mapped CPUs check

      and remove corresponding part in numa.c that uses
      node_cpu bitmaps.

      Signed-off-by: Igor Mammedov <imammedo@redhat.com>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Andrew Jones <drjones@redhat.com>
      Message-Id: <1494415802-227633-16-git-send-email-imammedo@redhat.com>
      Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

      This is because spapr_possible_cpu_arch_ids() ignores the CPU threads in the enumeration.
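      A minimal Python sketch of that reading (an assumption on my part, not QEMU source): with one possible_cpus slot per core, each cpus= value falls into the slot of the core containing it, so different slots end up unmapped than on pc:

```python
# Sketch (not QEMU code) of why the warning differs between pc and pseries
# for: -smp 4,maxcpus=8,cores=2,threads=2,sockets=2
#      -numa node,cpus=0,cpus=1 -numa node,cpus=3,cpus=4
smp_threads, max_cpus = 2, 8
requested = [0, 1, 3, 4]           # the cpus= values

# pc: one possible_cpus slot per thread; cpus= maps 1:1 to slots.
pc_unmapped = [c for c in range(max_cpus) if c not in requested]

# pseries: one slot per *core* (arch_id = core-id); a cpus= value is
# assumed to land in the slot of the core that contains it.
n_cores = max_cpus // smp_threads
covered = {c // smp_threads for c in requested}
spapr_unmapped = [(slot, slot * smp_threads)   # (CPU index, core-id)
                  for slot in range(n_cores) if slot not in covered]

print("pc unmapped:   ", pc_unmapped)      # [2, 5, 6, 7]
print("spapr unmapped:", spapr_unmapped)   # [(3, 6)] -> "CPU 3 [core-id: 6]"
```

This reproduces both observed warnings: x86 reports CPUs 2, 5, 6, 7 missing, while pseries reports only "CPU 3 [core-id: 6]".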

      — Additional comment from Laurent Vivier on 2019-01-31 18:03:22 UTC —

      It is interesting to note we also lost the error message about the invalidity of the command line:

      "Invalid node-id=1 of thread[cpu-index: 3] on CPU[core-id: 2, node-id: 4], node-id must be the same"

      Which means all threads of core must be on the same node.

      A valid command line is:

      qemu-system-ppc64 -smp 4,maxcpus=8,cores=2,threads=2,sockets=2 -numa node,cpus=0,cpus=1 -numa node,cpus=2,cpus=3 -numa node -numa node

      v2.9.0
      qemu-system-ppc64: warning: CPU(s) not present in any NUMA nodes: 4 5 6 7
      qemu-system-ppc64: warning: All CPU(s) up to maxcpus should be described in NUMA config
      QEMU 2.9.0 monitor - type 'help' for more information
      (qemu) info numa
      4 nodes
      node 0 cpus: 0 1
      node 0 size: 0 MB
      node 1 cpus: 2 3
      node 1 size: 0 MB
      node 2 cpus:
      node 2 size: 0 MB
      node 3 cpus:
      node 3 size: 512 MB
      (qemu) info hotpluggable-cpus
      Hotpluggable CPUs:
      type: "host-spapr-cpu-core"
      vcpus_count: "2"
      CPUInstance Properties:
      core-id: "6"
      type: "host-spapr-cpu-core"
      vcpus_count: "2"
      CPUInstance Properties:
      core-id: "4"
      type: "host-spapr-cpu-core"
      vcpus_count: "2"
      qom_path: "/machine/unattached/device[1]"
      CPUInstance Properties:
      core-id: "2"
      type: "host-spapr-cpu-core"
      vcpus_count: "2"
      qom_path: "/machine/unattached/device[0]"
      CPUInstance Properties:
      core-id: "0"

      v2.12.0

      qemu-kvm: warning: CPU(s) not present in any NUMA nodes: CPU 2 [core-id: 4], CPU 3 [core-id: 6]
      qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config, ability to start up with partial NUMA mappings is obsoleted and will be removed in future
      QEMU 2.12.0 monitor - type 'help' for more information
      (qemu) info numa
      4 nodes
      node 0 cpus: 0 1
      node 0 size: 0 MB
      node 0 plugged: 0 MB
      node 1 cpus: 2 3
      node 1 size: 256 MB
      node 1 plugged: 0 MB
      node 2 cpus:
      node 2 size: 0 MB
      node 2 plugged: 0 MB
      node 3 cpus:
      node 3 size: 256 MB
      node 3 plugged: 0 MB
      (qemu) info hotpluggable-cpus
      Hotpluggable CPUs:
      type: "host-spapr-cpu-core"
      vcpus_count: "2"
      CPUInstance Properties:
      node-id: "0"
      core-id: "6"
      type: "host-spapr-cpu-core"
      vcpus_count: "2"
      CPUInstance Properties:
      node-id: "0"
      core-id: "4"
      type: "host-spapr-cpu-core"
      vcpus_count: "2"
      qom_path: "/machine/unattached/device[1]"
      CPUInstance Properties:
      node-id: "1"
      core-id: "2"
      type: "host-spapr-cpu-core"
      vcpus_count: "2"
      qom_path: "/machine/unattached/device[0]"
      CPUInstance Properties:
      node-id: "0"
      core-id: "0"

      — Additional comment from Laurent Vivier on 2019-01-31 20:43:18 UTC —

      The following patch generates the right error message, but a thread of a core can still be plugged on a different node than the other threads of the same core:

      ----------------8<--------------------------------------------------------------------------------
      diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
      index 0942f35..679a93d 100644
      --- a/hw/ppc/spapr.c
      +++ b/hw/ppc/spapr.c
      @@ -2405,15 +2405,13 @@ static void spapr_validate_node_memory(MachineState *machine, Error **errp)
       /* find cpu slot in machine->possible_cpus by core_id */
       static CPUArchId *spapr_find_cpu_slot(MachineState *ms, uint32_t id, int *idx)
       {
      -    int index = id / smp_threads;
      -
      -    if (index >= ms->possible_cpus->len) {
      +    if (id >= ms->possible_cpus->len) {
               return NULL;
           }
           if (idx) {
      -        *idx = index;
      +        *idx = id;
           }
      -    return &ms->possible_cpus->cpus[index];
      +    return &ms->possible_cpus->cpus[id];
       }

       static void spapr_set_vsmt_mode(sPAPRMachineState *spapr, Error **errp)
      @@ -3796,14 +3794,9 @@ static const CPUArchIdList *spapr_possible_cpu_arch_ids(MachineState *machine)
       {
           int i;
           const char *core_type;
      -    int spapr_max_cores = max_cpus / smp_threads;
      -    MachineClass *mc = MACHINE_GET_CLASS(machine);
      -
      -    if (!mc->has_hotpluggable_cpus) {
      -        spapr_max_cores = QEMU_ALIGN_UP(smp_cpus, smp_threads) / smp_threads;
      -    }

           if (machine->possible_cpus) {
      -        assert(machine->possible_cpus->len == spapr_max_cores);
      +        assert(machine->possible_cpus->len == max_cpus);
               return machine->possible_cpus;
           }
      @@ -3814,16 +3807,18 @@ static const CPUArchIdList *spapr_possible_cpu_arch_ids(MachineState *machine)
           }

           machine->possible_cpus = g_malloc0(sizeof(CPUArchIdList) +
      -                                       sizeof(CPUArchId) * spapr_max_cores);
      -    machine->possible_cpus->len = spapr_max_cores;
      +                                       sizeof(CPUArchId) * max_cpus);
      +    machine->possible_cpus->len = max_cpus;
           for (i = 0; i < machine->possible_cpus->len; i++) {
      -        int core_id = i * smp_threads;
      -
               machine->possible_cpus->cpus[i].type = core_type;
               machine->possible_cpus->cpus[i].vcpus_count = smp_threads;
      -        machine->possible_cpus->cpus[i].arch_id = core_id;
      +        machine->possible_cpus->cpus[i].arch_id = i;
      +        machine->possible_cpus->cpus[i].props.has_socket_id = true;
      +        machine->possible_cpus->cpus[i].props.socket_id = i / (smp_threads * smp_cores);
               machine->possible_cpus->cpus[i].props.has_core_id = true;
      -        machine->possible_cpus->cpus[i].props.core_id = core_id;
      +        machine->possible_cpus->cpus[i].props.core_id = i / smp_threads;
      +        machine->possible_cpus->cpus[i].props.has_thread_id = true;
      +        machine->possible_cpus->cpus[i].props.thread_id = i % smp_threads;
           }

           return machine->possible_cpus;
       }
        ----------------8<--------------------------------------------------------------------------------

      $ qemu-system-ppc64 -smp 4,maxcpus=8,cores=2,threads=2,sockets=2 -numa node,cpus=0,cpus=1 -numa node,cpus=3,cpus=4 -numa node -numa node
      qemu-system-ppc64: warning: CPU(s) not present in any NUMA nodes: CPU 2 [socket-id: 0, core-id: 1, thread-id: 0], CPU 5 [socket-id: 1, core-id: 2, thread-id: 1], CPU 6 [socket-id: 1, core-id: 3, thread-id: 0], CPU 7 [socket-id: 1, core-id: 3, thread-id: 1]
      qemu-system-ppc64: warning: All CPU(s) up to maxcpus should be described in NUMA config, ability to start up with partial NUMA mappings is obsoleted and will be removed in future

      But the results of "info numa" and "info hotpluggable-cpus" don't seem correct.

      — Additional comment from Laurent Vivier on 2019-02-01 15:40:23 UTC —

      (In reply to Laurent Vivier from comment #5)
      > It is interesting to note we also lost the error message about the
      > invalidity of the command line:
      >
      > "Invalid node-id=1 of thread[cpu-index: 3] on CPU[core-id: 2, node-id: 4],
      > node-id must be the same"
      >

      This error checking has been removed by:

      commit 722387e78daf6a330220082934cfaaf68fa4d492
      Author: Igor Mammedov <imammedo@redhat.com>
      Date: Wed May 10 13:29:53 2017 +0200

      spapr: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu()

      it's safe to remove thread node_id != core node_id error
      branch as machine_set_cpu_numa_node() also does mismatch
      check and is called even before any CPU is created.

      Signed-off-by: Igor Mammedov <imammedo@redhat.com>
      Acked-by: David Gibson <david@gibson.dropbear.id.au>
      Message-Id: <1494415802-227633-10-git-send-email-imammedo@redhat.com>
      Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

      And it seems machine_set_cpu_numa_node() doesn't check that.

      — Additional comment from Laurent Vivier on 2019-02-13 10:59:01 UTC —

      Andrea,

      after discussion upstream, we plan to deprecate the "-numa node,cpus=XX" parameter as the "-node cpu" is a more accurate way to place the CPUs in the NUMA topology.

      Does libvirt use "-numa node,cpus=XX" or "-node cpu"?

      — Additional comment from Andrea Bolognani on 2019-02-13 12:24:46 UTC —

      (In reply to Laurent Vivier from comment #8)
      > Andrea,
      >
      > after discussion upstream, we plan to deprecate the "-numa node,cpus=XX"
      > parameter as the "-node cpu" is a more accurate way to place the CPUs in the
      > NUMA topology.
      >
      > Does libvirt use "-numa node,cpus=XX" or "-node cpu"?

      I assume you meant "-numa cpu" instead of "-node cpu", since the
      latter doesn't seem to exist. Either way, libvirt exclusively uses
      the "-numa node" syntax at the moment.

      — Additional comment from Laurent Vivier on 2019-02-13 13:54:04 UTC —

      (In reply to Andrea Bolognani from comment #9)
      > (In reply to Laurent Vivier from comment #8)
      > > Andrea,
      > >
      > > after discussion upstream, we plan to deprecate the "-numa node,cpus=XX"
      > > parameter as the "-node cpu" is a more accurate way to place the CPUs in the
      > > NUMA topology.
      > >
      > > Does libvirt use "-numa node,cpus=XX" or "-node cpu"?
      >
      > I assume you meant "-numa cpu" instead of "-node cpu", since the

      Yes, "-numa node,cpus=XX" and "-numa cpu,node-id=YY".

      > latter doesn't seem to exist. Either way, libvirt exclusively uses
      > the "-numa node" syntax at the moment.

      So, I guess this means we can't remove the support of "-numa node,cpus=XX" soon.

      Is there a problem if we want to deprecate this syntax?

      — Additional comment from Andrea Bolognani on 2019-02-13 15:08:08 UTC —

      (In reply to Laurent Vivier from comment #10)
      > (In reply to Andrea Bolognani from comment #9)
      > > Either way, libvirt exclusively uses
      > > the "-numa node" syntax at the moment.
      >
      > So, I guess this means we can't remove the support of "-numa node,cpus=XX"
      > soon.
      >
      > Is there a problem if we want to deprecate this syntax?

      Not as long as someone implements '-numa cpu' support in libvirt
      first, I guess

      — Additional comment from Laurent Vivier on 2019-02-14 14:02:15 UTC —

      I'm going to close this BZ as NOTABUG because:

      1- the pseries machine uses "cpus=" to specify cores on the command line (pc specifies threads);
      this is why the error message is different

      2- on pseries machine type, maxcpus is the total number of cores (on pc it's the total number of threads)

      3- we plan to deprecate the "-numa node,cpus=XX" parameter, the "-numa cpu,node-id=YY" will be preferred

      4- libvirt uses "-numa node,cpus=XX" but it uses it correctly

      For more details, you can read comments from David and Igor on the patch series I sent to use the number of threads instead of the number of cores:

      [RFC,0/4] numa, spapr: add thread-id in the possible_cpus list
      https://patchwork.kernel.org/cover/1080882

      [RFC,1/4] numa, spapr: add thread-id in the possible_cpus list
      [RFC,2/4] numa: exit on incomplete CPU mapping
      [RFC,3/4] numa: move cpu_slot_to_string() upper in the function
      [RFC,4/4] numa: check threads of the same core are on the same node

      — Additional comment from Min Deng on 2019-02-15 05:22:41 UTC —

      > 3- we plan to deprecate the "-numa node,cpus=XX" parameter, the "-numa
      > cpu,node-id=YY" will be preferred
      > 4- libvirt uses "-numa node,cpus=XX" but it uses it correctly

      Hi Laurent,
      Thanks for the information. I have two concerns about this bug.
      According to point 3, I think QE should use "-numa cpu,node-id=YY" instead of "-numa node,cpus=XX" in the test plan. But you also said "-numa node,cpus=XX" was used correctly in libvirt. Could you please explain more about it? Thanks.

      Best Regards
      Min

      — Additional comment from Laurent Vivier on 2019-02-15 08:00:23 UTC —

      (In reply to Min Deng from comment #13)
      > > 3- we plan to deprecate the "-numa node,cpus=XX" parameter, the "-numa
      > > cpu,node-id=YY" will be preferred
      > > 4- libvirt uses "-numa node,cpus=XX" but it uses it correctly
      >
      > Hi Laurent,
      > Thanks for the information,I have two concerns about bug.
      > According to 3,I think that QE should use "-numa cpu,node-id=YY" instead
      > of "-numa node,cpus=XX" in the test plan.But you also said "-numa

      Yes

      > node,cpus=XX" was used correctly in libvirt.Could you please explain more
      > about it ? Thanks.

      /usr/libexec/qemu-kvm must be used through libvirt, and libvirt doesn't provide an incomplete NUMA config, so there is no warning and maxcpus is correctly set.
      In fact we should open a BZ to ask libvirt to use "-numa cpu" instead of "-numa node,cpus=".

      — Additional comment from Min Deng on 2019-02-15 10:00:40 UTC —

      Hi Dan
      As Laurent gave some comments about how to configure CPU and NUMA for a guest at the libvirt level, would you please have a look at comment 14?

      Thanks
      Min

      — Additional comment from Laurent Vivier on 2019-02-15 11:26:31 UTC —

      Just a note:

      on pseries, core-ids are multiples of the number of threads per core.

      If we have 2 threads / core and maxcpus=24, we will have 12 core-ids with values: 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22.
      But the "cpus" values in "info numa" will be 0, 1, 2, 3, ... because "cpus" is core-id + thread-id.
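      The arithmetic above, spelled out for threads=2, maxcpus=24:

```python
# core-ids are multiples of 'threads'; the "cpus" value shown by
# "info numa" is core-id + thread-id.
threads, maxcpus = 2, 24

core_ids = [i * threads for i in range(maxcpus // threads)]
print(core_ids)   # [0, 2, 4, ..., 22] -> 12 core-ids

cpus = [core + t for core in core_ids for t in range(threads)]
print(cpus)       # [0, 1, 2, ..., 23]
```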

      For instance, the command line (from a question from Gu Nini):

      ... -smp 14,maxcpus=24,cores=4,threads=2,sockets=3 \
      -numa node,nodeid=0 -numa node,nodeid=1 -numa node,nodeid=2 \
      -numa cpu,node-id=0,core-id=0 -numa cpu,node-id=1,core-id=2 -numa cpu,node-id=2,core-id=4

      will generate:

      (qemu) info numa
      3 nodes
      node 0 cpus: 0 1 6 7 8 9 10 11 12 13
      node 0 size: 0 MB
      node 0 plugged: 0 MB
      node 1 cpus: 2 3
      node 1 size: 256 MB
      node 1 plugged: 0 MB
      node 2 cpus: 4 5
      node 2 size: 256 MB
      node 2 plugged: 0 MB

      core-id 0 (2 threads) is on node 0 (cpus 0 and 1 -> core-id 0 + thread-id 0, core-id 0 + thread-id 1)
      core-id 2 (2 threads) is on node 1 (cpus 2 and 3 -> core-id 2 + thread-id 0, core-id 2 + thread-id 1)
      core-id 4 (2 threads) is on node 2 (cpus 4 and 5 -> core-id 4 + thread-id 0, core-id 4 + thread-id 1)

      And all other cores are put on node 0 by default (because of the incomplete NUMA config)
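      The "info numa" cpu lists above can be reproduced with a few lines, assuming (as stated) that QEMU puts every unmapped core on node 0 when the NUMA config is incomplete:

```python
# Sketch reproducing the "info numa" cpu lists for
# -smp 14,maxcpus=24,cores=4,threads=2,sockets=3 with the partial mapping
# core-id 0 -> node 0, core-id 2 -> node 1, core-id 4 -> node 2.
threads, smp_cpus = 2, 14
mapping = {0: 0, 2: 1, 4: 2}               # core-id -> node-id (-numa cpu)

nodes = {0: [], 1: [], 2: []}
for core_id in range(0, smp_cpus, threads):    # the 7 present cores
    node = mapping.get(core_id, 0)             # unmapped core -> node 0
    nodes[node] += [core_id + t for t in range(threads)]

for n in sorted(nodes):
    print("node %d cpus:" % n, *sorted(nodes[n]))
```

This prints the same node 0/1/2 cpu lists as the monitor output quoted above.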
