FDP-2650 (Fast Datapath Product)

Test Coverage: Multiple Service monitor issues.

    • Task
    • Resolution: Unresolved
    • Normal
    • OVN

      ( ) The test coverage is aligned with the epic's acceptance criteria

      Given an OVN deployment with service monitors configured for load balancer backends (TCP and UDP),

      When backend servers change state,

      Then service monitor status is reported accurately within the configured wait_time without excessive pinctrl wake-ups, lost notifications, or premature health check packets.
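A test for this criterion needs to check that a status change shows up no later than wait_time plus some tolerance. The helper below is a minimal sketch of such a check, not part of any existing test suite: `get_status` is a hypothetical callable that returns the currently reported service monitor status, and `slack` and `interval` are assumed tuning knobs.

```python
import time


def wait_for_status(get_status, expected, wait_time, slack=1.0, interval=0.1,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll get_status() until it returns `expected`.

    Gives up once wait_time + slack seconds have elapsed.
    Returns the elapsed time in seconds on success, or None on timeout.
    get_status is a hypothetical hook supplied by the test; it is not an
    OVN API.
    """
    start = clock()
    deadline = start + wait_time + slack
    while True:
        if get_status() == expected:
            return clock() - start
        if clock() >= deadline:
            return None
        sleep(interval)
```

A test case could then assert that the returned value is not None, which fails when the status is only reported after an unrelated later wake-up. The `clock` and `sleep` parameters exist only so the helper can be exercised with a fake clock.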

    • rhel-9
    • rhel-net-ovn
    • ssg_networking

      This task tracks the test case writing needed to cover the bug described below.
      There are multiple issues in service monitor handling, mostly related to properly waking up pinctrl or the ovn-controller main thread.

      • pinctrl wakes up too often: when handling service monitors, pinctrl sometimes ends up looping/polling (multiple "wakeup due to 0-ms timeout at controller/pinctrl.c:8191" messages).
      • Even when n_failure_count is set to 1, a TCP service monitor is not always reported offline immediately after the expected wait_time, but only the next time pinctrl is woken up for some other reason, which adds random delays to offline reporting. The same applies to a UDP service monitor going online.
      • Sometimes the notification from the pinctrl thread to the main ovn-controller thread is "lost" and ovn-controller is not properly woken up. Another event is then needed to wake up ovn-controller, potentially delaying the service monitor status by up to 30 seconds. This is due to how seq_read/seq_wait is used.
      • Sometimes health check packets are lost because they are sent before some flows (e.g. load balancer related ones) are installed, so the service is not reported online until the next packets are sent.
      Most of these issues are related to service monitors only.
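To check for the excessive wake-ups in a test, one option is to count the poll-loop messages quoted above in the ovn-controller log. The snippet below is a rough sketch under that assumption; the function name is made up for illustration, and the regular expression matches only the message fragment given in this ticket, with the line number left variable because it changes between builds.

```python
import re
from collections import Counter

# Matches the message fragment quoted in the ticket. The pinctrl.c line
# number is captured so that different call sites are counted separately.
WAKEUP_RE = re.compile(
    r"wakeup due to 0-ms timeout at (controller/pinctrl\.c:\d+)"
)


def count_zero_ms_wakeups(log_text):
    """Return a Counter mapping 'controller/pinctrl.c:<line>' to hit count."""
    return Counter(m.group(1) for m in WAKEUP_RE.finditer(log_text))
```

A test could collect the log over a fixed interval while a service monitor is active and fail if any count exceeds a threshold chosen for that interval. What threshold is reasonable is not stated in this ticket and would need to be agreed separately.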

      ovn-controller not waking up might also affect other pinctrl-related services.
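The ticket does not say exactly how seq_read/seq_wait is misused, so the following is only an illustration of one well-known way a sequence-counter wake-up can be lost: taking the snapshot after the producer has already bumped the counter. `ToySeq` is a toy model written for this example and is not the OVS seq implementation; all names in it are hypothetical.

```python
class ToySeq:
    """Toy sequence counter; a stand-in for illustration, not OVS code."""

    def __init__(self):
        self.value = 0

    def change(self):
        self.value += 1

    def read(self):
        return self.value

    def would_wake(self, seen):
        # Models the wait step: wake only if the counter moved past `seen`.
        return self.value != seen


def consumer_iteration(snapshot_first):
    """Run one consumer loop iteration with a producer update in the middle.

    Returns (woke, unhandled): whether the consumer would be woken again,
    and how many updates are still waiting to be handled.
    """
    seq = ToySeq()
    pending = []

    seen = seq.read() if snapshot_first else None
    handled = list(pending)            # consumer handles what is pending: nothing yet
    pending.append("svc_mon_update")   # producer posts an update right after ...
    seq.change()                       # ... and bumps the counter
    if not snapshot_first:
        seen = seq.read()              # snapshot taken too late
    return seq.would_wake(seen), len(pending) - len(handled)
```

With `snapshot_first=True` the consumer is woken again and picks up the update. With `snapshot_first=False` one update remains unhandled and no wake-up happens until some other event arrives, which matches the delayed status described above.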

              ovnteam@redhat.com OVN Team
              rh-ee-sfaye Stanislas Faye
              OVN QE