Type: Task
Resolution: Done
Priority: Normal
Fix Version/s: None
Affects Version/s: None
Component/s: OVN
Labels:
- OVN-QE-Test-Coverage

Blocked:
False
Ready:
False
Epic Link:
Multiple Service monitor issues.
Acceptance Criteria:

( ) The test coverage is aligned with the epic's acceptance criteria

Given an OVN deployment with service monitors configured for load balancer backends (TCP and UDP),

When backend servers change state,

Then service monitor status is reported accurately within the configured wait_time without excessive pinctrl wake-ups, lost notifications, or premature health check packets.
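The "within the configured wait_time" part of this criterion can be expressed as a timing check. The following is a minimal sketch, assuming the test harness records timestamped status observations for a service monitor. The helper name `reported_within`, the event format, and the `tolerance` margin are hypothetical and not part of OVN.

```python
from typing import Iterable, Tuple


def reported_within(
    events: Iterable[Tuple[float, str]],
    changed_at: float,
    expected: str,
    wait_time: float,
    tolerance: float = 0.5,
) -> bool:
    """Return True if `expected` is observed by changed_at + wait_time + tolerance.

    events: (timestamp_seconds, status) pairs observed by the test, in any order.
    tolerance: hypothetical scheduling margin; tune it for the test environment.
    """
    deadline = changed_at + wait_time + tolerance
    # Only observations made after the backend changed state are relevant.
    hits = [ts for ts, status in events if ts >= changed_at and status == expected]
    return bool(hits) and min(hits) <= deadline


# Backend goes down at t=10 s with wait_time=3 s (illustrative values).
observed = [(9.0, "online"), (11.0, "online"), (13.2, "offline")]
on_time = reported_within(observed, changed_at=10.0, expected="offline", wait_time=3.0)

# Same change, but the status only flips on an unrelated wake-up at t=40 s.
late = reported_within([(40.0, "offline")], 10.0, "offline", 3.0)
```

A check of this shape fails when the status only changes on a later, unrelated wake-up, which is the random delay this coverage is meant to catch.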

OS:
rhel-9
Planning:
None
AssignedTeam:
rhel-net-ovn
Intelligence Requested:
Market:
Sub-System Group:

ssg_networking

Sprint:
OVN FDP Sprint 13, OVN FDP Sprint 14
sprint_count:
2

SFDC Cases Links:
SFDC Cases Open:
SFDC Cases Counter:

This task tracks the test-case writing needed to cover the bug described below.
There are multiple issues in handling service monitors, mostly related to properly waking up the pinctrl thread or the ovn-controller main thread.

- pinctrl wakes up too often: while handling service monitors, pinctrl sometimes ends up looping/polling (multiple "wakeup due to 0-ms timeout at controller/pinctrl.c:8191" messages).
- Even when n_failure_count is set to 1, a TCP service monitor is not always reported offline immediately after the expected wait_time, but only the next time pinctrl is woken up for some other reason, which adds random delays to offline reporting. The same applies to a UDP service monitor going online.
- Sometimes the notification from the pinctrl thread to the main ovn-controller thread is "lost" and ovn-controller is not properly woken up. Another event is then needed to wake up ovn-controller, potentially delaying the service monitor status by up to 30 seconds. This is due to how seq_read/seq_wait is used.
- Sometimes health check packets are lost because they are sent before some flows (e.g. load balancer related) are installed, which delays the service being reported online until the next packets are sent.
Most of these issues affect service monitors only.
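The excessive wake-ups can be checked from the logs. The following is a minimal sketch, assuming the test collects ovn-controller log lines into a list; the helper name `count_zero_ms_wakeups` is hypothetical, and the sample lines are illustrative rather than real log output. The pattern matches the message quoted in this ticket.

```python
import re

# Matches the message quoted in this ticket. The line number is left open
# because it may shift between builds.
ZERO_MS_WAKEUP = re.compile(r"wakeup due to 0-ms timeout at controller/pinctrl\.c:\d+")


def count_zero_ms_wakeups(log_lines):
    """Count log lines that report a 0-ms timeout wake-up in pinctrl."""
    return sum(1 for line in log_lines if ZERO_MS_WAKEUP.search(line))


# Illustrative input only, not captured from a real deployment.
sample = [
    "wakeup due to 0-ms timeout at controller/pinctrl.c:8191",
    "wakeup due to 0-ms timeout at controller/pinctrl.c:8191",
    "some unrelated log line",
]
zero_ms_count = count_zero_ms_wakeups(sample)
```

A test could compare this count over a fixed observation window against a threshold chosen for the environment; the threshold itself is left to the test author.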

However, ovn-controller not waking up might also affect other pinctrl-related services.
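To illustrate how a notification can be "lost" with a seq_read/seq_wait style API, the following is a simplified Python model, not OVS or OVN code. It assumes the usual semantics: a waiter sleeps only while the sequence number still equals the value it passed in. The class `Seq` and both iteration functions are hypothetical, and whether this exact ordering matches the actual defect is an assumption; the ticket only states that the cause lies in how seq_read/seq_wait is used.

```python
class Seq:
    """Toy model of a sequence-number notifier."""

    def __init__(self):
        self._value = 1

    def change(self):
        # The notifier (the pinctrl thread in this ticket) bumps the counter.
        self._value += 1

    def read(self):
        return self._value

    def wait_would_sleep(self, seen):
        # Models the wait step: the caller sleeps only if nothing changed
        # since the value it snapshotted.
        return self._value == seen


def safe_iteration(seq, notify_during_work):
    seen = seq.read()            # snapshot BEFORE processing state
    if notify_during_work:
        seq.change()             # notification arrives while we work
    return seq.wait_would_sleep(seen)


def racy_iteration(seq, notify_during_work):
    if notify_during_work:
        seq.change()             # notification arrives while we work
    seen = seq.read()            # snapshot taken too late
    return seq.wait_would_sleep(seen)


safe_sleeps = safe_iteration(Seq(), notify_during_work=True)
racy_sleeps = racy_iteration(Seq(), notify_during_work=True)
```

In the safe ordering the waiter does not sleep because the counter moved past its snapshot. In the racy ordering the snapshot already includes the change, so the waiter sleeps until some other event arrives, which is the kind of delay a test for this issue should detect.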

Assignee: OVN Team

Reporter: Stanislas Faye

QA Contact: OVN QE

Votes: 0

Watchers: 2

Created: 2025/11/12 9:46 AM

Updated: 2026/01/06 3:58 PM

Resolved: 2026/01/06 3:58 PM
