Type: Bug
Resolution: Unresolved
Priority: Normal
Fix Version/s: rhos-17.1.z
Affects Version/s: None
Component/s: openstack-neutron
Labels:
- TestOnly
- Triaged

Blocked:
False
Blocked Reason:
None
Ready:
False
Bugzilla Bug:
RHBZ: 2262654
Regression:
None
Intelligence Requested:
Market:
Target Version:

rhos-17.1.z

Severity:
Informational

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Description of problem:
When a controller node crash is triggered, the SNAT IP first moves to a running controller. Once the crashed controller is back up, the SNAT IP moves back from the controller that did not crash to the controller that crashed. We expect a low number of lost packets in both situations, but it seems that several packets are lost when the IP moves back.

Example:

Crash of a controller node at 02/02/2024 10:46:14 CET
-> from a laptop that can reach the SNAT IP assigned to the controller node:
64 bytes from 10.x.10.198: icmp_seq=2091 ttl=252 time=22.0 ms
3 packets lost
64 bytes from 10.x.10.198: icmp_seq=2095 ttl=252 time=27.6 ms
-> from the instance (No FIP) to external IP:
64 bytes from 10.x.11.254: icmp_seq=1245 ttl=63 time=1.09 ms
3 packets lost
64 bytes from 10.x.11.254: icmp_seq=1249 ttl=63 time=2.69 ms

So when the crash is triggered, we see 3 packets lost, which is acceptable.

Controller node comes back up at 02/02/2024 10:54:56 CET
-> from a laptop that can reach the SNAT IP assigned to the controller node:
64 bytes from 10.x.10.198: icmp_seq=2602 ttl=252 time=22.1 ms
19 packets lost
64 bytes from 10.x.10.198: icmp_seq=2622 ttl=252 time=21.7 ms
....
64 bytes from 10.x.10.198: icmp_seq=2647 ttl=252 time=21.6 ms
6 packets lost
64 bytes from 10.x.10.198: icmp_seq=2655 ttl=252 time=22.8 ms
-> from the instance (No FIP) to external IP:
64 bytes from 10.x.11.254: icmp_seq=1755 ttl=63 time=1.14 ms
20 packets lost
64 bytes from 10.x.11.254: icmp_seq=1776 ttl=63 time=4.26 ms

When the node comes back, we see more than 20 packets lost, and for the SNAT IP the loss seems to happen twice.
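The lost-packet counts above can be cross-checked from the `icmp_seq` values on either side of each gap. The following is a minimal sketch, not part of the original report; the function name `find_gaps` is made up for illustration, and it assumes the log contains the usual `icmp_seq=N` reply lines and that the sequence number does not wrap around during the capture.

```python
import re

# Matches the sequence number in a ping reply line such as
# "64 bytes from 10.x.11.254: icmp_seq=1245 ttl=63 time=1.09 ms"
SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def find_gaps(lines):
    """Return (last_seq_before_gap, first_seq_after_gap, lost_count) tuples.

    Hypothetical helper: sequence wrap-around is not handled.
    """
    gaps = []
    prev = None
    for line in lines:
        match = SEQ_RE.search(line)
        if match is None:
            continue  # not a reply line, ignore it
        seq = int(match.group(1))
        if prev is not None and seq > prev + 1:
            gaps.append((prev, seq, seq - prev - 1))
        prev = seq
    return gaps


# Excerpts taken from the instance-to-external-IP pings in this report.
crash_excerpt = [
    "64 bytes from 10.x.11.254: icmp_seq=1245 ttl=63 time=1.09 ms",
    "64 bytes from 10.x.11.254: icmp_seq=1249 ttl=63 time=2.69 ms",
]
recovery_excerpt = [
    "64 bytes from 10.x.11.254: icmp_seq=1755 ttl=63 time=1.14 ms",
    "64 bytes from 10.x.11.254: icmp_seq=1776 ttl=63 time=4.26 ms",
]

print(find_gaps(crash_excerpt))     # [(1245, 1249, 3)]
print(find_gaps(recovery_excerpt))  # [(1755, 1776, 20)]
```

Each excerpt is checked separately because the replies between the two events were omitted from the report.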

Version-Release number of selected component (if applicable):
Red Hat Openstack 17.1 (RHOSP17.1)

Steps to Reproduce:
1. Trigger a controller crash with `echo c > /proc/sysrq-trigger`.
2. Start pinging an external IP from the VM, or the SNAT IP from a host external to RHOSP.
3. When the controller node comes back up, several pings are lost in a specific interval.
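To correlate the lost pings with the crash and recovery times, the ping can be run with timestamps and the log post-processed. The sketch below is only an illustration under stated assumptions: it assumes iputils `ping -D` output (a `[unix_time]` prefix on each line), the helper name `loss_windows` is hypothetical, and the sample lines use made-up timestamps rather than data from this bug.

```python
import re
from datetime import datetime, timezone

# Matches iputils `ping -D` reply lines, for example:
# "[1000.000000] 64 bytes from 10.x.11.254: icmp_seq=1 ttl=63 time=1.09 ms"
LINE_RE = re.compile(r"^\[(\d+(?:\.\d+)?)\].*icmp_seq=(\d+)")


def loss_windows(lines):
    """Return one dict per gap in icmp_seq, with wall-clock start and end.

    Hypothetical helper: sequence wrap-around is not handled.
    """
    windows = []
    prev = None  # (timestamp, seq) of the last reply seen
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that are not timestamped replies
        ts, seq = float(match.group(1)), int(match.group(2))
        if prev is not None and seq > prev[1] + 1:
            windows.append({
                "start": datetime.fromtimestamp(prev[0], tz=timezone.utc),
                "end": datetime.fromtimestamp(ts, tz=timezone.utc),
                "lost": seq - prev[1] - 1,
                "seconds": round(ts - prev[0], 3),
            })
        prev = (ts, seq)
    return windows


# Made-up sample input, only to show the expected line shape.
sample = [
    "[1000.000000] 64 bytes from 10.x.11.254: icmp_seq=1 ttl=63 time=1.09 ms",
    "[1001.000000] 64 bytes from 10.x.11.254: icmp_seq=2 ttl=63 time=1.10 ms",
    "[1005.000000] 64 bytes from 10.x.11.254: icmp_seq=6 ttl=63 time=1.12 ms",
]

for window in loss_windows(sample):
    print(window["start"].isoformat(), window["lost"], window["seconds"])
```

A possible usage is to start the timestamped ping before step 1, save its output to a file, and pass the file's lines to `loss_windows` after the controller has rejoined.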

Actual results:
Pings are lost for several seconds when the controller node comes back up.

Expected results:
1 to 3 pings lost.

Additional info:

depends on

FDP-441 CR-LRP port flips flops after BFD failover due to unexpected chassis failure

To Do

external trackers

Red Hat Customer Portal 03720275

Red Hat Issue Tracker FDP-441

Assignee: Miro Tomaska

Reporter: RH Bugzilla Integration

QA Contact: Eran Kuris

Team: rhos-dfg-networking-squad-neutron

Votes: 0

Watchers: 1

Created: 2024/02/04 3:31 PM

Updated: 2025/02/07 9:17 PM
