OCPBUGS-4608 (OpenShift Bugs)

Startup Probe not honored in 4.8 since #RHBZ2095387

Details

    • Bug
    • Resolution: Can't Do
    • Normal
    • None
    • 4.10.z
    • Node / Kubelet
    • Important
    • False

    Description

      Description of problem:

      The issue occurs on a bare-metal cluster with a virtualized control plane.
      The customer deploys Spring Boot workloads with requests and limits (requests != limits).

      When a node is rebooted (config change or upgrade), since #RHBZ2095387, CRI-O no longer crashes, but startup probes do not seem to be honored.
      All pods start at the same time and the startup timeout is not honored.

      As a result, pods are killed before they have finished starting and are then restarted, and the resulting high load means the workloads take a very long time to start.
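
      To illustrate the failure mode described above, here is a minimal sketch of how a startup probe's time budget relates to a slowed-down startup. It assumes the approximate rule that a container gets about initialDelaySeconds + periodSeconds * failureThreshold seconds before the kubelet restarts it. The probe values, the 60-second nominal startup time, and the slowdown factor are hypothetical and are not taken from the customer's configuration.

```python
from dataclasses import dataclass


@dataclass
class StartupProbe:
    """Subset of Kubernetes probe fields (defaults mirror the Kubernetes defaults)."""
    initial_delay_seconds: int = 0
    period_seconds: int = 10
    failure_threshold: int = 3


def startup_budget_seconds(probe: StartupProbe) -> int:
    # Approximate time a container has to start before the kubelet
    # considers the startup probe failed and restarts the container.
    return probe.initial_delay_seconds + probe.period_seconds * probe.failure_threshold


def survives_startup(probe: StartupProbe, nominal_startup_seconds: float,
                     slowdown_factor: float) -> bool:
    # slowdown_factor is a hypothetical multiplier modelling CPU contention
    # when many pods start on the same node at once.
    return nominal_startup_seconds * slowdown_factor <= startup_budget_seconds(probe)


# Hypothetical probe: 30 failures allowed, probed every 10 seconds.
probe = StartupProbe(initial_delay_seconds=0, period_seconds=10, failure_threshold=30)
budget = startup_budget_seconds(probe)
alone = survives_startup(probe, nominal_startup_seconds=60, slowdown_factor=1.0)
contended = survives_startup(probe, nominal_startup_seconds=60, slowdown_factor=6.0)
```

      Under these assumed numbers, a pod that starts in 60 seconds on an idle node fits inside the budget, while the same pod slowed down six-fold by simultaneous startups does not. It would be restarted and would add to the load again, which matches the restart loop reported here.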

      Version-Release number of selected component (if applicable):

      4.10

      How reproducible:

      Drain a node

      Steps to Reproduce:

      1. Drain a node
      2. Another node takes over the pods and the issue occurs
      

      Actual results:

      500 pods start in ~2 days

      Expected results:

      500 pods start in a reasonable time, as they did before 4.8+

      Additional info:

      The cluster also uses TSP with 2 zones and anti-affinity rules.

            People

              rphillip@redhat.com Ryan Phillips
              rh-support-sbelmasg Simon Belmas
              Sunil Choudhary Sunil Choudhary