OpenShift Bugs · OCPBUGS-38663

clusteroperator/kube-scheduler blips Degraded=True during upgrade test

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • Affects Version: 4.18
    • Component: kube-scheduler
    • Category: Quality / Stability / Reliability

      Description of problem:

          As part of an effort to ensure that no HA component goes Degraded=True by design during normal e2e or upgrade runs, we are collecting all operators that blip Degraded=True during any payload job run.
      
      This card tracks the kube-scheduler operator, which blips Degraded=True during upgrade runs.
      
      
      Example job: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.18-upgrade-from-stable-4.17-e2e-azure-ovn-upgrade/1843275894844559360
        
      Reason associated with the blip: NodeController_MasterNodesReady
      
      For now we have added an exception to the test, but teams are expected to fix the underlying issues and remove the exceptions once the fixes go in.
      
      See the linked issue for more explanation of the effort.
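The exception mechanism described above can be sketched roughly as follows. This is a hypothetical Python illustration, not the actual origin test code (which is written in Go); the table, key shape, and helper name are all made up for this example. The idea is that a transition is acceptable only when its operator/condition/reason tuple matches a registered exception, and unmatched transitions fail the test-case.

```python
# Hypothetical sketch of the exception matching described above.
# All names here are illustrative, not from the real monitor test.
KNOWN_EXCEPTIONS = {
    # (operator, condition, reason) -> tracking bug for the excepted blip
    ("kube-scheduler", "Degraded", "NodeController_MasterNodesReady"):
        "https://issues.redhat.com/browse/OCPBUGS-38663",
}

def classify_transition(operator, condition, reason):
    """Return the exception URL if this blip is excepted, else None.

    A None result means the transition is unexpected and fails the
    test-case, as happens with the composite
    GuardController_SyncError::NodeController_MasterNodesReady reason
    shown in the Additional info below.
    """
    return KNOWN_EXCEPTIONS.get((operator, condition, reason))
```

Under this model, removing the exception entry after the fix merges is what turns any recurrence back into a hard test failure.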

      Version-Release number of selected component (if applicable):

          

      How reproducible:

          

      Steps to Reproduce:

          1.
          2.
          3.
          

      Actual results:

          

      Expected results:

          

      Additional info:

      Found a new reason, job example:

      : [Monitor:legacy-cvo-invariants][bz-kube-scheduler] clusteroperator/kube-scheduler should not change condition/Degraded (2h2m5s) {  2 unexpected clusteroperator state transitions during e2e test run.  These did not match any known exceptions, so they cause this test-case to fail:
      
      Nov 20 07:12:46.091 E clusteroperator/kube-scheduler condition/Degraded reason/GuardController_SyncError::NodeController_MasterNodesReady status/True GuardControllerDegraded: Unable to apply pod openshift-kube-scheduler-guard-ip-10-0-121-171.us-east-2.compute.internal changes: Operation cannot be fulfilled on pods "openshift-kube-scheduler-guard-ip-10-0-121-171.us-east-2.compute.internal": the object has been modified; please apply your changes to the latest version and try again\nNodeControllerDegraded: The master nodes not ready: node "ip-10-0-121-171.us-east-2.compute.internal" not ready since 2025-11-20 07:12:24 +0000 UTC because KubeletNotReady (container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: no CNI configuration file in /etc/kubernetes/cni/net.d/. Has your network provider started?)
      Nov 20 07:12:46.091 - 2s    E clusteroperator/kube-scheduler condition/Degraded reason/GuardController_SyncError::NodeController_MasterNodesReady status/True GuardControllerDegraded: Unable to apply pod openshift-kube-scheduler-guard-ip-10-0-121-171.us-east-2.compute.internal changes: Operation cannot be fulfilled on pods "openshift-kube-scheduler-guard-ip-10-0-121-171.us-east-2.compute.internal": the object has been modified; please apply your changes to the latest version and try again\nNodeControllerDegraded: The master nodes not ready: node "ip-10-0-121-171.us-east-2.compute.internal" not ready since 2025-11-20 07:12:24 +0000 UTC because KubeletNotReady (container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: no CNI configuration file in /etc/kubernetes/cni/net.d/. Has your network provider started?)
      
      7 unwelcome but acceptable clusteroperator state transitions during e2e test run.  These should not happen, but because they are tied to exceptions, the fact that they did happen is not sufficient to cause this test-case to fail:
      
      Nov 20 07:06:07.804 E clusteroperator/kube-scheduler condition/Degraded reason/NodeController_MasterNodesReady status/True NodeControllerDegraded: The master nodes not ready: node "ip-10-0-120-40.us-east-2.compute.internal" not ready since 2025-11-20 07:06:05 +0000 UTC because KubeletNotReady (container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: no CNI configuration file in /etc/kubernetes/cni/net.d/. Has your network provider started?) (exception: https://issues.redhat.com/browse/OCPBUGS-38663)
      Nov 20 07:06:07.804 - 24s   E clusteroperator/kube-scheduler condition/Degraded reason/NodeController_MasterNodesReady status/True NodeControllerDegraded: The master nodes not ready: node "ip-10-0-120-40.us-east-2.compute.internal" not ready since 2025-11-20 07:06:05 +0000 UTC because KubeletNotReady (container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: no CNI configuration file in /etc/kubernetes/cni/net.d/. Has your network provider started?) (exception: https://issues.redhat.com/browse/OCPBUGS-38663)
      Nov 20 07:06:32.501 W clusteroperator/kube-scheduler condition/Degraded reason/AsExpected status/False NodeControllerDegraded: All master nodes are ready (exception: Degraded=False is the happy case)
      Nov 20 07:12:48.697 W clusteroperator/kube-scheduler condition/Degraded reason/AsExpected status/False NodeControllerDegraded: All master nodes are ready\nGuardControllerDegraded: Unable to apply pod openshift-kube-scheduler-guard-ip-10-0-121-171.us-east-2.compute.internal changes: Operation cannot be fulfilled on pods "openshift-kube-scheduler-guard-ip-10-0-121-171.us-east-2.compute.internal": the object has been modified; please apply your changes to the latest version and try again (exception: Degraded=False is the happy case)
      Nov 20 07:18:03.320 E clusteroperator/kube-scheduler condition/Degraded reason/NodeController_MasterNodesReady status/True NodeControllerDegraded: The master nodes not ready: node "ip-10-0-7-90.us-east-2.compute.internal" not ready since 2025-11-20 07:17:57 +0000 UTC because KubeletNotReady (container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: no CNI configuration file in /etc/kubernetes/cni/net.d/. Has your network provider started?) (exception: https://issues.redhat.com/browse/OCPBUGS-38663)
      Nov 20 07:18:03.320 - 17s   E clusteroperator/kube-scheduler condition/Degraded reason/NodeController_MasterNodesReady status/True NodeControllerDegraded: The master nodes not ready: node "ip-10-0-7-90.us-east-2.compute.internal" not ready since 2025-11-20 07:17:57 +0000 UTC because KubeletNotReady (container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: no CNI configuration file in /etc/kubernetes/cni/net.d/. Has your network provider started?) (exception: https://issues.redhat.com/browse/OCPBUGS-38663)
      Nov 20 07:18:20.883 W clusteroperator/kube-scheduler condition/Degraded reason/AsExpected status/False NodeControllerDegraded: All master nodes are ready (exception: Degraded=False is the happy case)
      }    
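For triage it can help to pull these interval lines apart mechanically. Below is a hedged Python sketch; the regex is derived only from the line shapes shown above and is not part of any OpenShift tooling. It extracts the timestamp, optional duration, severity, operator, condition, composite reason, and status.

```python
import re

# Pattern derived from the monitor interval lines above; everything after
# status/ is the free-form message and is ignored here. Illustrative only.
LINE_RE = re.compile(
    r"(?P<ts>\w+ \d+ [\d:.]+)\s+(?:-\s+(?P<duration>\S+)\s+)?"
    r"(?P<level>[EW]) clusteroperator/(?P<operator>\S+) "
    r"condition/(?P<condition>\S+) reason/(?P<reason>\S+) "
    r"status/(?P<status>\S+)"
)

def parse_interval(line):
    """Parse one interval line; composite reasons are split on '::'."""
    m = LINE_RE.search(line)
    if not m:
        return None
    fields = m.groupdict()
    fields["reasons"] = fields["reason"].split("::")
    return fields
```

For the first flagged line above, this yields operator kube-scheduler, condition Degraded, status True, and the two stacked reasons GuardController_SyncError and NodeController_MasterNodesReady.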

              aos-workloads-staff Workloads Team Bot Account
              kenzhang@redhat.com Ken Zhang