Product Technical Learning / PTL-9898

Single Node OpenShift cert renewal deadlock

      Issue description

      Once the lab environment images are more than 90 days old, the node certificate for the kube-apiserver must be renewed. In some classrooms this renewal does not complete, which leaves the cluster completely unusable. The symptoms are either the kube-apiserver operator getting stuck upgrading or the authentication service throwing TLS errors.

      Steps to reproduce:

      Start a lab in one of the affected courses whose environment image is more than 90 days old.

      Workaround:

      Force the certificate renewal. From workstation, SSH to the utility VM, then run the oc command on utility:

      ssh lab@utility
      oc get secret -A -o json | jq -r '.items[] | select(.metadata.annotations."auth.openshift.io/certificate-not-after" | .!=null and fromdateiso8601<='$( date --date='+1year' +%s )') | "-n \(.metadata.namespace) \(.metadata.name)"' | xargs -n3 oc patch secret -p='{"metadata": {"annotations": {"auth.openshift.io/certificate-not-after": null}}}' 
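Before patching, it can help to see which secrets the workaround would touch. The following is a minimal dry-run sketch that reuses the same jq filter but only prints the matching namespace and name pairs. The function name list_expiring_secrets is hypothetical and not part of the lab tooling; the sketch assumes jq is available on utility, as the workaround itself does.

```shell
#!/usr/bin/env bash
# Dry-run sketch: list the secrets the workaround would patch, without changing them.
# list_expiring_secrets is a hypothetical helper name, not part of the lab tooling.

# Reads `oc get secret -A -o json` output on stdin.
# $1 is the cutoff as a Unix epoch; secrets whose not-after date is at or before it match.
list_expiring_secrets() {
  local cutoff="$1"
  jq -r --argjson cutoff "$cutoff" '
    .items[]
    | select(.metadata.annotations."auth.openshift.io/certificate-not-after"
             | . != null and fromdateiso8601 <= $cutoff)
    | "\(.metadata.namespace) \(.metadata.name)"'
}

# Usage on utility, with the same one-year cutoff as the workaround:
#   oc get secret -A -o json | list_expiring_secrets "$(date --date='+1year' +%s)"
```

Passing the cutoff with --argjson instead of splicing it into the jq program avoids the nested shell quoting of the one-liner. The selection logic is otherwise unchanged.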

      Expected result:
      Within 5 minutes, the API server will restart with the correctly renewed certificate, and the cluster will be healthy again.  Use the wait.sh script on utility to verify this.
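The contents of wait.sh are not shown in this issue. As a rough stand-in, the two operators named in the symptoms can be checked directly with oc wait. The helper name operators_settled is hypothetical, and the 300s default only mirrors the 5-minute expectation above.

```shell
#!/usr/bin/env bash
# Sketch: wait for the operators named in the issue to settle after the patch.
# operators_settled is a hypothetical helper; it is not the lab's wait.sh.
operators_settled() {
  local timeout="${1:-300s}"   # default mirrors the 5-minute expectation
  # First wait until neither operator is still rolling out a change...
  oc wait clusteroperator/kube-apiserver clusteroperator/authentication \
    --for=condition=Progressing=False --timeout="$timeout" &&
  # ...then confirm both report themselves as available.
  oc wait clusteroperator/kube-apiserver clusteroperator/authentication \
    --for=condition=Available=True --timeout="$timeout"
}

# Usage on utility:
#   operators_settled && echo "cluster operators healthy"
```

While the API server is restarting, oc wait may exit early with a connection error, so the check may need to be rerun once the API responds again.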

            rht-rallred Richard Allred