OCPBUGS-9229

Graceful shutdown improvements

    • Moderate
    • None
    • 3
    • OSDOCS Sprint 233, OSDOCS Sprint 234, OSDOCS Sprint 235, OSDOCS Sprint 237, OSDOCS Sprint 238, OSDOCS Sprint 236, OSDOCS Sprint 239, OSDOCS Sprint 241, OSDOCS Sprint 243
    • 9
    • Unspecified
    • N/A
    • Release Note Not Required

      Document URL:
      https://docs.openshift.com/container-platform/4.9/backup_and_restore/graceful-cluster-shutdown.html

      Section Number and Name:
      Shutting down the cluster, Point #2

      Describe the issue:
      We currently advise customers to run a for loop that forces a shutdown of every OpenShift node without first marking the nodes unschedulable (cordoning them) and draining them.

      Suggestions for improvement:

      I would recommend the following approach, since it marks all of the nodes unschedulable and also drains all of the worker nodes:

      #2 Mark the nodes unschedulable before performing the pod evacuation.
      ```
      for node in $(oc get nodes -o jsonpath='{.items[*].metadata.name}'); do echo ${node} ; oc adm cordon ${node} ; done
      ```
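
Before moving on to the drain step, it may be worth confirming that the cordon actually took effect. The following is only a sketch: `list_schedulable_nodes` is a hypothetical helper name (not part of the `oc` CLI), and it assumes that a cordoned node reports `spec.unschedulable` as `true` while an uncordoned node leaves the field empty.

```shell
# Hypothetical helper: print the names of nodes that are still schedulable.
# Assumes cordoned nodes report spec.unschedulable=true and that the field
# is empty (unset) on nodes that have not been cordoned.
list_schedulable_nodes() {
  oc get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.spec.unschedulable}{"\n"}{end}' \
    | awk '$2 != "true" { print $1 }'
}
```

If `list_schedulable_nodes` prints nothing, every node is cordoned; any name it prints is a node that was missed by the loop above.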

      #3 Evacuate the pods using the following method:
      ```
      for node in $(oc get nodes -l node-role.kubernetes.io/worker -o jsonpath='{.items[*].metadata.name}'); do echo ${node} ; oc adm drain ${node} --delete-emptydir-data --ignore-daemonsets=true --timeout=15s ; done
      ```

      #4 Shut down all of the nodes in the cluster. You can do this from your cloud provider’s web console, or run the following loop:

      ```
      for node in $(oc get nodes -o jsonpath='{.items[*].metadata.name}'); do oc debug node/${node} -- chroot /host shutdown -h 1 ; done
      ```
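
For illustration, the three proposed steps could be chained in a single function so that no node is shut down unless the cordon and drain steps succeeded. This is a rough sketch, not proposed doc text: `shutdown_cluster` is a hypothetical name, and aborting on the first cordon or drain failure is an assumption about the desired behaviour. The `oc` commands and flags are the ones from the steps above.

```shell
# Hypothetical wrapper around proposed steps #2 to #4.
# Assumption: stop on the first cordon or drain failure, before any shutdown.
shutdown_cluster() {
  local all_nodes worker_nodes node
  all_nodes=$(oc get nodes -o jsonpath='{.items[*].metadata.name}') || return 1
  worker_nodes=$(oc get nodes -l node-role.kubernetes.io/worker \
    -o jsonpath='{.items[*].metadata.name}') || return 1

  # Step 2: mark every node unschedulable.
  for node in ${all_nodes}; do
    oc adm cordon "${node}" || return 1
  done

  # Step 3: evacuate pods from the worker nodes only.
  for node in ${worker_nodes}; do
    oc adm drain "${node}" --delete-emptydir-data \
      --ignore-daemonsets=true --timeout=15s || return 1
  done

  # Step 4: schedule a shutdown on every node, one minute out.
  for node in ${all_nodes}; do
    oc debug "node/${node}" -- chroot /host shutdown -h 1
  done
}
```

Calling `shutdown_cluster` with a cluster-admin login would run the whole sequence. With a 15 second drain timeout a drain can fail on a busy node, so whether stopping at that point is the right behaviour depends on the cluster.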

      Additional information:

      I think this approach gives a better outcome: with every node marked unschedulable, Kubernetes will not try to schedule workloads onto a node that is about to be shut down while another node is going down.
      I also think this could have a positive effect on etcd.
      Thanks

              rhn-support-nalhadef Neal Alhadeff (Inactive)
              rhn-support-fisantos Filipe Santos
              Min Li Min Li
              Latha Sreenivasa Murthy Latha Sreenivasa Murthy
              Red Hat Employee
              Sunil Choudhary