OCPBUGS-61735 (OpenShift Bugs)

OpenShift 4.19 cluster breaks after the cluster has been powered off for some hours and powered on again.


    • Bug
    • Resolution: Not a Bug
    • Critical
    • 4.19.z
    • kube-apiserver
    • Quality / Stability / Reliability
    • Critical
    • x86_64

      Description of problem:

      A 4.19.9 cluster recently deployed with the Assisted Installer breaks if it is powered off for 24 hours or more. Bootstrapper CSRs are created, but once they are approved, the kube-apiserver (KAS) stops trusting the kubelet client certificates, rendering the cluster unusable.

      Version-Release number of selected component (if applicable):

      4.19.9

      How reproducible:

      Deploy a fresh 4.19.9 multi-node cluster using the Assisted Installer. Once the cluster is deployed, power it off. After 24 hours or more, bring the cluster up again.
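      The 24-hour threshold in this report can be illustrated with a small timing check. This is only a sketch: it assumes the initial kubelet client certificate of a fresh cluster has a validity window of about 24 hours, which the report itself does not state. The function name, the constant, and the dates below are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption (not stated in the report): the initial kubelet client
# certificate of a freshly installed cluster is valid for about 24 hours.
ASSUMED_INITIAL_VALIDITY = timedelta(hours=24)


def expired_while_off(issued_at, powered_on_at, validity=ASSUMED_INITIAL_VALIDITY):
    """Return True if a certificate issued at `issued_at` is already past
    its validity window when the nodes are powered on at `powered_on_at`."""
    return powered_on_at >= issued_at + validity


# Hypothetical installation time, used only for illustration.
installed = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)

print(expired_while_off(installed, installed + timedelta(hours=6)))   # False
print(expired_while_off(installed, installed + timedelta(hours=30)))  # True
```

      Under that assumption, a cluster powered off shortly after installation and powered on more than 24 hours later would come back with kubelet client certificates that have already expired, which would be consistent with new CSRs appearing right after boot.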

      Steps to Reproduce:

          1. Create a basic multi-node cluster using version 4.19.9.
          2. Wait for the cluster to be deployed, then power the nodes off.
          3. After 24 hours or more, bring the cluster up again.
      
      The symptoms are the following:
      - CSRs are created right away on all nodes but remain pending (not approved).
      - After all the pending CSRs are approved, contact with the cluster is lost. oc logs and oc rsh no longer work because they fail with a TLS error.
      - KAS starts rejecting connections from the kubelet because it no longer trusts the kubelet certificate. As a result, the CSR for kubelet-server-current.pem is never issued.
      - The cluster stops working.
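      To tell which CSRs are still pending and whether they are kubelet client or kubelet serving requests, the JSON output of oc get csr -o json can be inspected. The sketch below assumes the standard Kubernetes CSR layout, where a pending request has neither an Approved nor a Denied condition. The helper name and the sample data are hypothetical and do not come from the affected cluster.

```python
# Well-known Kubernetes signer names for kubelet certificates.
KUBELET_CLIENT_SIGNER = "kubernetes.io/kube-apiserver-client-kubelet"
KUBELET_SERVING_SIGNER = "kubernetes.io/kubelet-serving"


def pending_csrs(csr_list):
    """Return (name, signerName) for every CSR in a CSR list object that
    has neither an Approved nor a Denied condition, i.e. is still pending."""
    pending = []
    for item in csr_list.get("items", []):
        conditions = item.get("status", {}).get("conditions") or []
        decided = any(c.get("type") in ("Approved", "Denied") for c in conditions)
        if not decided:
            name = item["metadata"]["name"]
            signer = item.get("spec", {}).get("signerName", "")
            pending.append((name, signer))
    return pending


# Hypothetical sample shaped like the output of `oc get csr -o json`.
sample = {
    "items": [
        {
            "metadata": {"name": "csr-aaaaa"},
            "spec": {"signerName": KUBELET_CLIENT_SIGNER},
            "status": {},  # no conditions: still pending
        },
        {
            "metadata": {"name": "csr-bbbbb"},
            "spec": {"signerName": KUBELET_SERVING_SIGNER},
            "status": {"conditions": [{"type": "Approved", "status": "True"}]},
        },
    ]
}

for name, signer in pending_csrs(sample):
    print(name, signer)
```

      With real data, the dictionary could be loaded with json.load from a file saved via oc get csr -o json. If no request with the kubelet-serving signer ever shows up, that would match the symptom that the kubelet-server-current.pem CSR is never issued.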
          

      Actual results:

      Cluster stops working.

      Expected results:

      Cluster should be working with no issues. 

      Additional info:

          

              Unassigned
              Alfredo Pizarro (rhn-gps-alfredo)
              Ke Wang