OpenShift Bugs / OCPBUGS-2260

KubePodNotReady - Increase Tolerance During Master Node Restarts


Details

    • Bug
    • Resolution: Done
    • Critical
    • None
    • 4.13, 4.12
    • Monitoring
    • Important
    • Approved
    • False
    • Reducing the alerting noise during upgrades is important enough that we want this to be fixed before GA.
    • Previously, the Kubernetes scheduler could skip scheduling certain pods for a node that received multiple restart operations. {product-title} {product-version} counteracts this issue by including the `KubePodNotScheduled` alert for pods that cannot be scheduled within 30 minutes. (link:https://issues.redhat.com/browse/OCPBUGS-2260[*OCPBUGS-2260*])
    • Bug Fix
    • Done

    Description

      TRT-594 investigates failed CI upgrade runs caused by the KubePodNotReady alert firing. In that case a pod was repeatedly skipped for scheduling across two successive master node updates/restarts. The scenario was determined to be valid behavior, so the ask is to make the monitoring aware that master nodes are restarting and that scheduling may therefore be delayed. Assuming we do not want to change the existing tolerance for cases where no master node is restarting, could we suppress the alert during those restarts and fall back to a second alert with an increased tolerance for that window, provided we have metrics indicating that a restart is in progress? Alternative approaches are welcome if there is a better way to handle this.
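      The sketch below shows one possible shape for such an increased-tolerance alert, expressed as a PrometheusRule manifest with a 30 minute `for` duration matching the tolerance mentioned in the release note above. It is illustrative only: the rule and alert names are hypothetical, the expression assumes the kube_pod_status_unschedulable metric from kube-state-metrics is available, and it is not claimed to be the rule actually shipped for this fix.

      apiVersion: monitoring.coreos.com/v1
      kind: PrometheusRule
      metadata:
        name: pod-scheduling-tolerance-sketch    # hypothetical name
        namespace: openshift-monitoring
      spec:
        groups:
        - name: scheduling-tolerance.rules
          rules:
          - alert: KubePodNotScheduledSketch      # hypothetical alert name
            # Fire only after a pod has been unschedulable for 30 minutes,
            # long enough to ride out successive master node restarts.
            expr: |
              max by (namespace, pod) (
                kube_pod_status_unschedulable{job="kube-state-metrics"}
              ) == 1
            for: 30m
            labels:
              severity: warning
            annotations:
              summary: Pod {{ $labels.namespace }}/{{ $labels.pod }} has been unschedulable for 30 minutes.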

      The scenario is:

      • A master node (1) is out of service during the upgrade
      • A pod (A) is created but cannot be scheduled because of anti-affinity rules: the other eligible nodes already host a pod from the same definition (see the deployment sketch after this list)
      • A second pod (B) from the same definition is created after the first
      • Pod (A) attempts scheduling but fails because master node (1) is still updating
      • Master node (1) completes updating
      • Pod (B) attempts scheduling and succeeds
      • The next master node (2) begins updating
      • Pod (A) cannot be scheduled on the next attempt(s) because the active master nodes already have pods placed and master node (2) is unavailable
      • Master node (2) completes updating
      • Pod (A) is scheduled
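      The anti-affinity constraint in the scenario can be illustrated with a manifest like the sketch below: required pod anti-affinity spreads one replica per control-plane node, so while one master node is restarting a replacement replica has no eligible node and stays Pending. All names, the namespace, and the image are hypothetical placeholders, not taken from the failing CI jobs.

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: example-controller          # hypothetical name
        namespace: example-namespace      # hypothetical namespace
      spec:
        replicas: 3                       # one replica per master node
        selector:
          matchLabels:
            app: example-controller
        template:
          metadata:
            labels:
              app: example-controller
          spec:
            nodeSelector:
              node-role.kubernetes.io/master: ""
            tolerations:
            - key: node-role.kubernetes.io/master
              operator: Exists
              effect: NoSchedule
            affinity:
              podAntiAffinity:
                # Each replica must land on a different node; if one master
                # is down, a recreated replica cannot be scheduled anywhere.
                requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchLabels:
                      app: example-controller
                  topologyKey: kubernetes.io/hostname
            containers:
            - name: controller
              image: registry.example.com/controller:latest   # placeholder image
              resources: {}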


            People

              dfitzmau@redhat.com Darragh Fitzmaurice
              rh-ee-fbabcock Forrest Babcock
              Junqi Zhao Junqi Zhao
