CNV-68658 (OpenShift Virtualization)

[4.20] NodeNetworkInterfaceDown alert fires for disabled NICs


    • CNV I/U Operators Sprint 276
    • Customer Reported

      Description of problem:

      The NodeNetworkInterfaceDown alert fires for NICs that have been deliberately disabled, even when they are unused.

      Version-Release number of selected component (if applicable):

      4.18

      How reproducible:

      Every time

      Steps to Reproduce:

      1. Disable an unused NIC on a node.
      2. Observe that the NodeNetworkInterfaceDown alert fires.

      Actual results:

      The alert fires whenever any unused NIC is disabled.

      Expected results:

      The alert should not fire for disabled, unused NICs. Should it instead target only the default OVN interface and/or veths?
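
One way to picture the "target only specific interfaces" idea is a name-based selection applied before the down check. The sketch below is illustrative only: the patterns `br-ex` and `veth*`, the input shape (device name mapped to an operational-up boolean), and both function names are assumptions made for this example, not the alert's actual rule or anything confirmed in this report.

```python
from fnmatch import fnmatchcase

# Hypothetical selection list. The report does not say which interface names
# should be watched; "br-ex" and "veth*" are placeholders for illustration.
WATCHED_PATTERNS = ("br-ex", "veth*")


def is_watched(device, patterns=WATCHED_PATTERNS):
    """Return True if the device name matches any of the glob patterns."""
    return any(fnmatchcase(device, pattern) for pattern in patterns)


def interfaces_to_alert_on(oper_up_by_device, patterns=WATCHED_PATTERNS):
    """Return the sorted names of watched devices that are operationally down.

    oper_up_by_device maps a device name to True (up) or False (down).
    Devices that match no pattern are ignored even when they are down.
    """
    return sorted(
        device
        for device, is_up in oper_up_by_device.items()
        if not is_up and is_watched(device, patterns)
    )


if __name__ == "__main__":
    # eno3 and eno4 are down but not watched, so they are not reported.
    sample = {"eno3": False, "eno4": False, "br-ex": True, "veth1a2b": False}
    print(interfaces_to_alert_on(sample))
```

With a selection like this, unused physical NICs such as eno3 and eno4 would never contribute to the alert, at the cost of having to keep the pattern list in sync with the cluster's network setup.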

      Additional info:

      Internal slack thread: https://redhat-internal.slack.com/archives/C017V3R4M08/p1743716777926719

      Alert still firing for disabled interfaces:

      I'm still seeing this on my local cluster with CNV 4.18.11

      sh-5.1# ip link show | grep -i down
      5: eno3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
      6: eno4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
      7: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
      10: br-int: <BROADCAST,MULTICAST> mtu 1400 qdisc noop state DOWN mode DEFAULT group default qlen 1000
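
The four entries above are not all down in the same way: eno3 and eno4 carry the `UP` flag together with `NO-CARRIER` (administratively enabled, no link), while ovs-system and br-int have no `UP` flag at all (administratively down). A minimal sketch of how that distinction could be pulled out of `ip link show` output follows. The regular expression, the category labels, and the `classify` function are assumptions made for this example; which signal the alert itself evaluates is not stated in this report.

```python
import re

# Matches the first line of an `ip link show` entry, for example:
#   5: eno3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN ...
# A peer suffix such as "veth0@if3" is dropped from the captured name.
LINK_RE = re.compile(
    r"^\s*\d+:\s+(?P<name>[^:@\s]+)(?:@\S+)?:\s+"
    r"<(?P<flags>[^>]*)>.*?\bstate\s+(?P<state>\S+)"
)


def classify(line):
    """Return (name, kind) for one `ip link show` line, or None if it does not parse.

    kind is one of these labels (chosen for this example):
      "not-down"   - the reported state is anything other than DOWN
      "admin-down" - state DOWN and the UP flag is absent
      "no-carrier" - state DOWN, UP flag present, NO-CARRIER flag present
      "down-other" - state DOWN, UP flag present, no NO-CARRIER flag
    """
    match = LINK_RE.match(line)
    if match is None:
        return None
    flags = {flag for flag in match.group("flags").split(",") if flag}
    if match.group("state") != "DOWN":
        kind = "not-down"
    elif "UP" not in flags:
        kind = "admin-down"
    elif "NO-CARRIER" in flags:
        kind = "no-carrier"
    else:
        kind = "down-other"
    return match.group("name"), kind


if __name__ == "__main__":
    output = """\
5: eno3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
7: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
"""
    for line in output.splitlines():
        print(classify(line))
```

Separating the two cases this way makes it easier to state precisely which kind of "down" an alert is expected to ignore.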
      
      ⬢ [fmhirtz@toolbx TrueNasDemocraticCsiConfig]$ oc rsh -n openshift-monitoring alertmanager-main-0 \
      amtool alert query --alertmanager.url http://localhost:9093
      Alertname                    Starts At                Summary                                                                                  State   
      Watchdog                     2025-08-12 14:24:40 UTC  An alert that should always be firing to certify that Alertmanager is working properly.  active  
      PrometheusRemoteWriteBehind  2025-08-12 14:25:45 UTC  Prometheus remote write is behind.                                                       active  
      NodeNetworkInterfaceDown     2025-08-12 14:30:07 UTC  Network interfaces are down                                                              active  
      NodeNetworkInterfaceDown     2025-08-12 14:30:07 UTC  Network interfaces are down                                                              active  
      LowVirtControllersCount      2025-08-12 14:35:48 UTC  More than one virt-controller should be ready if more than one worker node.              active  
      LowVirtAPICount              2025-08-12 15:25:48 UTC  More than one virt-api should be running if more than one worker nodes exist.            active  

      Previous PRs:

      https://github.com/kubevirt/hyperconverged-cluster-operator/pull/3482

              alitman@redhat.com Aviv Litman
              jhopper@redhat.com Jenifer Abrams
              Harel Meir