  1. MicroShift
  2. USHIFT-1301

Ensure all pods have good probes


    • Epic
    • Resolution: Won't Do
    • Normal
    • ensure good probes
    • In Progress
    • Quality / Stability / Reliability
    • OCPSTRAT-1134 MicroShift robustness Initiative
    • 0% To Do, 0% In Progress, 100% Done

      Epic Goal

      Technical Debt:

      • Kubernetes probes (startup, liveness, readiness) are vital for resilience: they let the platform detect and remedy problematic operational states.
      • The goal of this epic is to ensure that all MicroShift pods have the necessary probes and that those probes follow Kubernetes best practices.

      Why is this important?

      • Missing or misconfigured probes can let failures go undetected or cause frequent pod crashes and restarts.

      Acceptance Criteria

      • All pods/containers have at least readiness and liveness (health) probes, with sensible configuration (timeouts etc.).
      • Probes follow best practices:
        • cheap and easy - no expensive bash commands
        • quick - probes should respond in under 0.5 seconds
        • side-effect free - probes should not modify state
        • .... 

      Previous Work (Optional):

      1. I sampled only a few pods and found containers with either no probes at all (e.g. topolvm-node) or probes that look expensive or appear to have side effects (e.g. the ovnkube-master pod).

      Open questions:

      Done Checklist

      • Do a spike to review ALL pods/containers for their probes. Create tasks to fix them.
      • Provide a dev doc on the best practices to adhere to.
      • Fix missing/bad probes.
      • No CI, external docs, or QE needed.

              Unassigned
              Daniel Fröhlich (dfroehli42rh)