Red Hat OpenStack Services on OpenShift / OSPRH-3148

_numa_cells_support_network_metadata does not log any output even at debug

    • openstack-nova-27.1.1-18.0.20231114154650.a869ab1.el9ost
    • Release Note Not Required
    • Low

      +++ This bug was initially created as a clone of Bug #2107306 +++

      Description of problem:
Instance creation fails at the NUMATopologyFilter even though it appears that at least one NUMA node has enough resources.

      Version-Release number of selected component (if applicable):
      Red Hat OpenStack Platform release 16.2.2 (Train)

      How reproducible:
Every time we try to spawn an instance using this flavor.

      Steps to Reproduce:
1. Try to create a VM using the flavor.

Actual results:
Instance creation is blocked at the NUMATopologyFilter.

Expected results:
The VM is created on a NUMA node with available resources.

      Additional info:
      [stack@director ]$ openstack flavor show ovn-dpdk
Field                       | Value
----------------------------+------------------------------------------------------------------------------------------------------------------------
OS-FLV-DISABLED:disabled    | False
OS-FLV-EXT-DATA:ephemeral   | 0
access_project_ids          | None
description                 | None
disk                        | 20
extra_specs                 | {'hw:cpu_policy': 'dedicated', 'hw:emulator_threads_policy': 'isolate', 'hw:mem_page_size': '1GB', 'ovn-dpdk': 'true'}
name                        | ovn-dpdk
os-flavor-access:is_public  | True
properties                  | hw:cpu_policy='dedicated', hw:emulator_threads_policy='isolate', hw:mem_page_size='1GB', ovn-dpdk='true'
ram                         | 4096
rxtx_factor                 | 1.0
swap                        | 0
vcpus                       | 4

      — Additional comment from Jean-Francois Beaudoin on 2022-07-14 17:37:22 UTC —

      — Additional comment from Jean-Francois Beaudoin on 2022-07-14 18:39:03 UTC —

      Here is the command used to create the instance:

openstack server create --image rhel-server-7.9.raw --boot-from-volume 20 --nic net-id=1830fb1a-d1ce-4477-a237-aa934636670c --config-drive true rh_server_raw_004 --flavor ovn-dpdk

And here is the network:

      (centro) [stack@mxcnnfv3clarhospdir scripts]$ openstack network show 1830fb1a-d1ce-4477-a237-aa934636670c
Field                      | Value
---------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------
admin_state_up             | UP
availability_zone_hints    |
availability_zones         |
created_at                 | 2022-07-06T01:00:05Z
description                |
dns_domain                 |
id                         | 1830fb1a-d1ce-4477-a237-aa934636670c
ipv4_address_scope         | None
ipv6_address_scope         | None
is_default                 | False
is_vlan_transparent        | None
location                   | cloud='', project.domain_id=, project.domain_name='Default', project.id='5b9adfa304bd4f51a702bfff5a10ed8b', project.name='admin', region_name='regionOne', zone=
mtu                        | 9000
name                       | EOM2741_N1
port_security_enabled      | True
project_id                 | 5b9adfa304bd4f51a702bfff5a10ed8b
provider:network_type      | vlan
provider:physical_network  | ovsdpdk_numa1
provider:segmentation_id   | 2741
qos_policy_id              | None
revision_number            | 2
router:external            | External
segments                   | None
shared                     | True
status                     | ACTIVE
subnets                    | 224d0b1d-8c6b-415e-bcdd-c74dd3d44131
tags                       |
updated_at                 | 2022-07-06T01:00:10Z

      — Additional comment from Jean-Francois Beaudoin on 2022-07-14 19:31:44 UTC —

      podman exec -it $(sudo podman ps -q -f name=galera) mysql nova -BNe "select numa_topology from compute_nodes where compute_nodes.host = 'mxcntulm01nfv3e1c02.mx.att.com'"

      — Additional comment from on 2022-07-14 20:50:33 UTC —

      // ESCALATION MANAGEMENT TEAM NOTE //

      Justification for checking off the Customer Escalation Flag:

- The case associated with this bug, 03266134, is currently on the WatchList due to increased attention and production impact.
- This is a new issue that arose after addressing the initial issue in https://bugzilla.redhat.com/show_bug.cgi?id=2101924 and https://gss--c.visualforce.com/apex/Case_View?id=5006R00001mEd9r&sfdc.override=1

      Background:
The AT&T Mexico Telco Cloud NFV platform is unable to go into production. The problem is occurring in the final stage of the project and is damaging Red Hat's image, our message about taking Telco functions to virtual environments, and our short-term goal of deploying CNFs for 5G on our flagship solution, OpenShift. The project is heavily delayed, putting in jeopardy deals that could reach US$2.2M in 2022 and US$5M in 2023, among other projects already in the pipeline.

      Harpreet Singh
      Senior Escalation Manager

      — Additional comment from Jean-Francois Beaudoin on 2022-07-14 21:25:08 UTC —

The issue is resolved. This is my understanding; correct me if I'm wrong:

There are 24 CPUs in NUMA node 1, with 19 already pinned, so there is only room for 1 more VM (each VM needs 5 pinned CPUs: 4 vCPUs plus 1 for the isolated emulator thread), and that one VM does spawn successfully.

The network used is:
~~~
| provider:physical_network | ovsdpdk_numa1 |
~~~
So VMs on this network will be placed on NUMA node 1.

Since there are no CPUs reserved for the host, we expect to be able to spawn 4 VMs on an empty node. We have also confirmed this: on an empty node, with the same flavor/network, it is possible to create the expected 4 VMs.

So they will need to use a network whose physnet maps to a different NUMA node if they want to spawn more instances on that compute.
This is what is currently configured, as seen in nova.conf on that compute:
~~~
# List of physnets present on this host. For more information, refer to the
# documentation. (list value)
#physnets =
physnets=ovsdpdk_numa0,ovsdpdk_numa1
~~~
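For context, the physnet-to-NUMA-node mapping that drives this placement is expressed through per-physnet groups in nova.conf. A minimal sketch of what such a configuration typically looks like for these two physnets (the numa_nodes values below are an assumption based on the physnet names, not taken from the customer's environment):
~~~
[neutron]
# Physnets known to this host (matches the excerpt above).
physnets = ovsdpdk_numa0,ovsdpdk_numa1

# Per-physnet NUMA affinity: the group name embeds the physnet name.
[neutron_physnet_ovsdpdk_numa0]
numa_nodes = 0

[neutron_physnet_ovsdpdk_numa1]
numa_nodes = 1
~~~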

      — Additional comment from Artom Lifshitz on 2022-07-15 20:39:46 UTC —

      (In reply to Jean-Francois Beaudoin from comment #5)
      > The issue is resolved, this is my understanding, correct me If I'm wrong;
      >
      >
      > There's 24cpu in numa1, with 19 already pinned, so only room for 1 VM(uses 5
      > cpu), which does work.
      >
      > The network used is:
      > ~~~
      > | provider:physical_network | ovsdpdk_numa1
      > ~~~
      > So this network will spawn VM into NUMA-1.
      >
      >
      > Since there's no reserved cpu for the host, with a empty node, we expect to
      > be able to spawn 4VM.
      > We've also confirmed it, with a empty node , with same flavor/network, it's
      > possible to create the expected 4VM.
      >
      > So they'll need to use a network physnet requesting a different numa node if
      > they want to spawn more instances on that compute.

Yep, or have an hw:numa_nodes=2 (or more) guest NUMA topology. As long as one of the guest NUMA nodes is on the same host NUMA node as the physnet, that will pass scheduling, allowing the rest of the guest NUMA nodes to land on different host NUMA nodes. However, there is a risk of a performance drop because of cross-NUMA communication.
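As a rough illustration of that workaround (an example only, not a command taken from this case; it reuses the ovn-dpdk flavor from this report):
~~~
openstack flavor set ovn-dpdk --property hw:numa_nodes=2
~~~
With a two-node guest topology, only one of the guest NUMA nodes has to land on the host NUMA node that the physnet is affined to.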

      > This is the current one configured, seen from nova.conf on that compute;
      > ~~~
      > # List of physnets present on this host. For more information, refer to the
      > # documentation. (list value)
      > #physnets =
      > physnets=ovsdpdk_numa0,ovsdpdk_numa1
      > ~~~

      We'll leave this BZ open to track potentially improving the logging in that code, specifically in or around _numa_cells_support_network_metadata().

      — Additional comment from on 2022-07-18 12:04:19 UTC —

Updating the title to reflect that this is being used to track improving logging.

tl;dr
The original bug report was invalid because the customer did not actually have enough space to boot all the VMs they wanted on the host in question.
However, while debugging this we noticed that _numa_cells_support_network_metadata does not have any logging, so when it eliminates a host cell
because the NUMA-aware vSwitch feature is in use, there is no log to indicate that. This makes debugging scheduling issues related to NUMA-aware
vSwitches very difficult without intimate knowledge of the code. We can improve this trivially by adding logging at debug and/or info level
when a cell is eliminated.
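For illustration only, a minimal, self-contained sketch of the kind of debug logging being proposed. This is not the actual nova code: the function name, signature, and the cell/metadata structures below are simplified stand-ins, and the point is only where the LOG.debug() calls go when a cell is eliminated.
~~~
import logging

LOG = logging.getLogger(__name__)


def cells_support_network_metadata(host_cells, physnets, tunneled):
    """Return the host NUMA cells that can satisfy the requested networks,
    logging every cell that gets eliminated along the way.

    host_cells: list of dicts like
        {'id': 1, 'network_metadata': {'physnets': {'ovsdpdk_numa1'},
                                       'tunneled': False}}
    physnets:   set of physnet names requested by the instance's ports
    tunneled:   whether the instance also uses tunneled (VXLAN/GENEVE) networks
    """
    supported = []
    for cell in host_cells:
        meta = cell['network_metadata']
        missing = physnets - meta['physnets']
        if missing:
            # This is the log line the report asks for: without it, a cell
            # silently drops out of scheduling and operators cannot tell why
            # NUMATopologyFilter rejected the host.
            LOG.debug('Skipping host NUMA cell %d: physnet(s) %s are not '
                      'associated with this cell', cell['id'], sorted(missing))
            continue
        if tunneled and not meta['tunneled']:
            LOG.debug('Skipping host NUMA cell %d: tunneled networks are not '
                      'associated with this cell', cell['id'])
            continue
        supported.append(cell)
    return supported


if __name__ == '__main__':
    logging.basicConfig(level=logging.DEBUG)
    cells = [
        {'id': 0, 'network_metadata': {'physnets': {'ovsdpdk_numa0'},
                                       'tunneled': False}},
        {'id': 1, 'network_metadata': {'physnets': {'ovsdpdk_numa1'},
                                       'tunneled': False}},
    ]
    # Requesting physnet ovsdpdk_numa1 (as in this report) eliminates cell 0,
    # and the debug output now says so explicitly.
    print([c['id'] for c in cells_support_network_metadata(
        cells, {'ovsdpdk_numa1'}, tunneled=False)])
~~~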

      — Additional comment from RHEL Program Management on 2022-07-20 15:32:28 UTC —

      The keyword FutureFeature has been added. If this bug is not a FutureFeature, please remove from the Summary field any strings containing "RFE, rfe, FutureFeature, FEAT, Feat, feat". Additionally, if this feature is being backported to a previous release, clone this BZ to older releases and add the "FeatureBackport" keyword only to those cloned BZs.

      — Additional comment from Artom Lifshitz on 2022-10-04 19:24:03 UTC —

I'm going to convert this to a bug to improve logging in that area of the code, targeting 16.x because we'll need it for customer cases.

      — Additional comment from Jorge San Emeterio on 2022-10-10 13:26:01 UTC —

      Upstream bug at:
      https://bugs.launchpad.net/nova/+bug/1751784

      — Additional comment from Artom Lifshitz on 2022-12-19 17:36:11 UTC —

      Moving this to 16.2 as no more maintenance releases are planned for 16.1.

      — Additional comment from Artom Lifshitz on 2023-06-05 18:42:11 UTC —

      I think aiming for 16.2.6 with this is realistic, given how small the patch is.

      — Additional comment from RHEL Program Management on 2023-06-05 18:42:22 UTC —

      This item has been properly Triaged and planned for the release, and Target Release is now set to match the release flag. For details, see https://mojo.redhat.com/docs/DOC-1195410

      — Additional comment from Artom Lifshitz on 2023-07-17 16:14:49 UTC —

      Moving to 17.1.2 as it's not yet merged on master, and 17.1.1 will be exceptions only, and we can't have regressions between 16.2.6 and 17.1.1.

      — Additional comment from RHEL Program Management on 2023-07-17 16:15:01 UTC —

      This item has been properly Triaged and planned for the release, and Target Release is now set to match the release flag.

      — Additional comment from Ian Frangs on 2023-08-03 15:46:23 UTC —

      If you think customers need a description of this bug in addition to the content of the BZ summary field, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field. The documentation team will review, edit, and approve the text.

      If this bug does not require an additional Doc Text description, please set the 'requires_doc_text' flag to '-'.

      If the BZ already has Doc Text, please perform a quick review. Update it if needed.

      — Additional comment from RHEL Program Management on 2023-10-16 15:54:29 UTC —

      This bugzilla has had its Target Release removed since it does not have a Target Milestone set (i.e. it has not been committed for a specific release).

      — Additional comment from RHEL Program Management on 2023-11-14 01:45:37 UTC —

      This item has been properly Triaged and planned for the release, and Target Release is now set to match the release flag.

              mwitt@redhat.com melanie witt
              jira-bugzilla-migration RH Bugzilla Integration
              rhos-workloads-compute