OpenShift Request For Enhancement / RFE-4128

Allow usage of debug pod when node pod capacity is full


    • Type: Feature Request
    • Resolution: Unresolved
    • Priority: Minor
    • Node

      1. Proposed title of this feature request
      Allow usage of debug pod (oc debug node/nodename) when node pod capacity is full

      2. What is the nature and description of the request?
      The customer has a node with a pod capacity of, for example, 120 pods (set through kubeletconfig.spec.kubeletConfig.maxPods):

      Capacity:
        attachable-volumes-azure-disk:  32
        cpu:                            32
        ephemeral-storage:              785896428Ki
        hugepages-1Gi:                  0
        hugepages-2Mi:                  0
        memory:                         264123100Ki
        pods:                           120 <---------
      Allocatable:
        attachable-volumes-azure-disk:  32
        cpu:                            31850m
        ephemeral-storage:              763446302735
        hugepages-1Gi:                  0
        hugepages-2Mi:                  0
        memory:                         250491612Ki
        pods:                           120 <-------
      ..
      
      
      Non-terminated Pods:                      (120 in total)
      

      Consequently, if the node reaches its maximum pod capacity, it is not possible to spawn a debug pod, and the command fails with the error below:

      oc debug node/nodename
      Temporary namespace openshift-debug-dj2j5 is created for debugging node...
      Starting pod/xxxxx-debug ...
      To use host binaries, run `chroot /host`
      
      Removing debug pod ...
      Temporary namespace openshift-debug-dj2j5 was removed.
      Error from server (BadRequest): container "container-00" in pod "xxxxx-debug" is not available
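
Before running `oc debug node/nodename`, it may help to check whether the node has any pod slots left. The following is a minimal sketch, not part of the original request. It assumes the node object and the pods on that node have already been fetched as JSON (for example with `oc get node nodename -o json` and `oc get pods --all-namespaces --field-selector spec.nodeName=nodename -o json`). The function name `pod_headroom` is hypothetical.

```python
# Sketch: compute how many pod slots remain on a node.
# Assumption: `node` and `pod_list` are dicts parsed from `oc ... -o json` output.

# Pods in these phases have terminated and are treated here as not occupying a slot.
TERMINATED_PHASES = {"Succeeded", "Failed"}


def pod_headroom(node: dict, pod_list: dict) -> int:
    """Return allocatable pod slots minus non-terminated pods on the node."""
    # The API reports status.allocatable.pods as a string such as "120".
    allocatable = int(node["status"]["allocatable"]["pods"])
    active = sum(
        1
        for pod in pod_list.get("items", [])
        if pod.get("status", {}).get("phase") not in TERMINATED_PHASES
    )
    return allocatable - active


if __name__ == "__main__":
    # Illustrative data mirroring the situation in this request:
    # 120 allocatable pods and 120 non-terminated pods.
    node = {"status": {"allocatable": {"pods": "120"}}}
    pods = {"items": [{"status": {"phase": "Running"}} for _ in range(120)]}
    print(pod_headroom(node, pods))
```

A result of zero or less corresponds to the state in which the debug pod cannot start on the node.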
      

      3. Why does the customer need this? (List the business requirements here)

      The allocatable pod count of 120 is set to match the /25 node subnet, which can only allocate around that number of IPs. However, debug pods are started with host networking and do not consume an IP from that subnet.
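
The relationship between the /25 subnet and the limit of 120 can be made concrete with a little arithmetic. This is only an illustrative sketch: the CIDR `10.128.0.0/25` is a made-up example, the helper `ip_margin` is hypothetical, and how many addresses the cluster network actually reserves per node is not stated in this request.

```python
import ipaddress


def ip_margin(cidr: str, max_pods: int) -> int:
    """Return the number of addresses in the subnet beyond max_pods."""
    total = ipaddress.ip_network(cidr).num_addresses
    return total - max_pods


if __name__ == "__main__":
    # A /25 holds 2 ** (32 - 25) = 128 addresses in total.
    print(ipaddress.ip_network("10.128.0.0/25").num_addresses)
    # With maxPods at 120, only a small margin of addresses is left over.
    print(ip_margin("10.128.0.0/25", 120))
```

Because a host-network pod uses the node's own address, it would not draw from this pool, yet it still counts toward maxPods.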
      
      Because the customer runs a lot of pods on their nodes, raising the limit (for example, setting maxPods to 121) would not help: other pods would very quickly be scheduled on the node, making it impossible to run debug pods again.
      
      The customer's question is therefore: is there a way to spawn debug pods even when the node's pod capacity is reached?
      

      4. List any affected packages or components.

      oc debug

              gausingh@redhat.com Gaurav Singh
              rhn-support-psingour Poornima Singour