OpenShift Bugs / OCPBUGS-9229

Graceful shutdown improvements

Details

    • Moderate
    • 3
    • OSDOCS Sprint 233, OSDOCS Sprint 234, OSDOCS Sprint 235, OSDOCS Sprint 237, OSDOCS Sprint 238, OSDOCS Sprint 236, OSDOCS Sprint 239, OSDOCS Sprint 241, OSDOCS Sprint 243
    • 9
    • Unspecified
    • N/A
    • Release Note Not Required

    Description

      Document URL:
      https://docs.openshift.com/container-platform/4.9/backup_and_restore/graceful-cluster-shutdown.html

      Section Number and Name:
      Shutting down the cluster, Point #2

      Describe the issue:
      We currently advise customers to run a for loop over all of the OpenShift nodes and force a shutdown, without first marking the nodes unschedulable (cordoning them) and draining them.

      Suggestions for improvement:

      I recommend the following approach, which marks all of the nodes unschedulable and also drains all of the worker nodes:

      #2 Mark the nodes unschedulable before performing the pod evacuation.
      ```
      for node in $(oc get nodes -o jsonpath='{.items[*].metadata.name}'); do echo ${node} ; oc adm cordon ${node} ; done
      ```
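
A possible check after the cordon loop, as a minimal sketch: it lists any node that is still schedulable, so an empty result suggests every node was cordoned. The function name `list_schedulable_nodes` is hypothetical, and the sketch assumes `oc` is logged in with sufficient rights and that `awk` is available.

```shell
# Print the names of nodes that are still schedulable (not cordoned).
# list_schedulable_nodes is a hypothetical helper, not part of oc.
list_schedulable_nodes() {
  # .spec.unschedulable is "true" on cordoned nodes and empty otherwise.
  oc get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.unschedulable}{"\n"}{end}' |
    awk -F'\t' '$2 != "true" { print $1 }'
}
```

Calling `list_schedulable_nodes` before moving on to the drain step gives a quick confirmation that no node was missed.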

      #3 Evacuate the pods using the following method:
      ```
      for node in $(oc get nodes -l node-role.kubernetes.io/worker -o jsonpath='{.items[*].metadata.name}'); do echo ${node} ; oc adm drain ${node} --delete-emptydir-data --ignore-daemonsets=true --timeout=15s ; done
      ```
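
With a 15 second timeout, some drains may not finish, and the loop above moves on without saying so. One way to surface that, sketched under assumptions: the function name `drain_workers` and the `DRAIN_TIMEOUT` variable are hypothetical, and the `oc adm drain` flags are the same ones used in the loop above.

```shell
# Drain every worker node and report the ones whose drain did not finish.
# drain_workers and DRAIN_TIMEOUT are hypothetical names; the timeout
# defaults to the 15s used in the loop above.
drain_workers() {
  local timeout="${DRAIN_TIMEOUT:-15s}"
  local failed=""
  local node
  for node in $(oc get nodes -l node-role.kubernetes.io/worker \
      -o jsonpath='{.items[*].metadata.name}'); do
    echo "draining ${node}"
    if ! oc adm drain "${node}" --delete-emptydir-data \
        --ignore-daemonsets=true --timeout="${timeout}"; then
      failed="${failed} ${node}"
    fi
  done
  if [ -n "${failed}" ]; then
    echo "drain incomplete on:${failed}" >&2
    return 1
  fi
}
```

For example, `DRAIN_TIMEOUT=60s drain_workers` would retry with a longer timeout; a non-zero exit status means at least one node still has pods that were not evicted.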

      #4 Shut down all of the nodes in the cluster. You can do this from your cloud provider’s web console, or run the following loop:

      ```
      for node in $(oc get nodes -o jsonpath='{.items[*].metadata.name}'); do oc debug node/${node} -- chroot /host shutdown -h 1 ; done
      ```
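
Because the shutdown loop is disruptive, it may help to preview the commands first. The following is a sketch only: `shutdown_nodes`, `DRY_RUN`, and `SHUTDOWN_DELAY` are hypothetical names introduced for illustration, not `oc` options. The delay defaults to the 1 minute used in the loop above.

```shell
# Shut down every node, or only print the commands when DRY_RUN=1.
# shutdown_nodes, DRY_RUN and SHUTDOWN_DELAY are hypothetical names.
shutdown_nodes() {
  local delay="${SHUTDOWN_DELAY:-1}"
  local node
  for node in $(oc get nodes -o jsonpath='{.items[*].metadata.name}'); do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Preview mode: show what would run without touching the node.
      echo "oc debug node/${node} -- chroot /host shutdown -h ${delay}"
    else
      oc debug "node/${node}" -- chroot /host shutdown -h "${delay}"
    fi
  done
}
```

Running `DRY_RUN=1 shutdown_nodes` prints one command per node; running `shutdown_nodes` without it performs the shutdown.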

      Additional information:

      I think this approach gives a better outcome because it marks every node unschedulable first. While one node is shutting down, Kubernetes will then not try to schedule workloads onto another node that is also about to shut down.
      I also think this could benefit etcd.
      Thanks

          People

            rhn-support-nalhadef Neal Alhadeff
            rhn-support-fisantos Filipe Santos
            Min Li Min Li
            Latha Sreenivasa Murthy Latha Sreenivasa Murthy
            Red Hat Employee
            Sunil Choudhary