• BU Product Work
    • SDN Sprint 232

      Create a warning-severity alert to notify the admin that packet loss is occurring due to failed OVS vswitchd lookups. This may occur if vswitchd is CPU constrained and there are also numerous lookups.

      Use the metric ovs_vswitchd_netlink_overflow, which shows netlink messages dropped by the vswitchd daemon due to buffer overflow in userspace.

      For the kernel equivalent, use the metric ovs_vswitchd_dp_flows_lookup_lost. Both metrics usually have the same value but may differ if vswitchd restarts.

      Both metrics should be aggregated into a single alert that fires if the value has increased recently.
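The "has increased recently" condition can be sketched as follows. This is only an illustration of the intended logic, not the rule merged into CNO: the function names `counter_increase` and `should_warn` are hypothetical, the samples are assumed to be counter values scraped over the recent window, and a drop in value is assumed to mean a counter reset (for example after a vswitchd restart).

```python
from typing import Sequence


def counter_increase(samples: Sequence[float]) -> float:
    """Sum the increase across consecutive samples of a counter.

    A drop between two samples is treated as a counter reset (assumption:
    the process restarted), so the new value itself counts as the increase.
    """
    total = 0.0
    for prev, curr in zip(samples, samples[1:]):
        total += curr - prev if curr >= prev else curr
    return total


def should_warn(netlink_overflow: Sequence[float],
                dp_flows_lookup_lost: Sequence[float]) -> bool:
    """Return True if either counter grew within the sampled window."""
    return (counter_increase(netlink_overflow) > 0
            or counter_increase(dp_flows_lookup_lost) > 0)
```

Combining the two checks with a logical OR keeps a single alert while still catching the case where the two counters diverge after a restart.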

       

      DoD: QE test case, code merged to CNO, metrics document updated ( https://docs.google.com/document/d/1lItYV0tTt5-ivX77izb1KuzN9S8-7YgO9ndlhATaVUg/edit )

            [CORENET-2713] [alert] Netlink overflow

            Qiong Wang added a comment -

            QE test cases for the two alerts:

            https://polarion.engineering.redhat.com/polarion/#/project/OSE/workitem?id=OCP-60705 

            https://polarion.engineering.redhat.com/polarion/#/project/OSE/workitem?id=OCP-60706 

            No Accepted 4.13 nightly image contains the alerts yet, so I tested them on 4.13.0-0.ci-2023-03-16-074638 and they passed.

            Martin Kennelly added a comment - Looks like this just merged post 4.13 branching, so I will need a backport to 4.13. 

            Martin Kennelly added a comment - Merged. Ready for QE.

            Martin Kennelly added a comment - PR: https://github.com/openshift/cluster-network-operator/pull/1630  

              mkennell@redhat.com Martin Kennelly