OCPBUGS-34817

Infrastructure node workloads "taints" are ambiguous in example and description


      In https://docs.openshift.com/container-platform/4.14/machine_management/creating-infrastructure-machinesets.html#binding-infra-node-workloads-using-taints-tolerations_creating-infrastructure-machinesets

      The following text and examples are ambiguous. They alternate between the taints `node-role.kubernetes.io/infra:NoSchedule` and `node-role.kubernetes.io/infra=reserved:NoExecute`. The example applies the taint `node-role.kubernetes.io/infra=reserved:NoExecute`, but the text that explains it reads `This example places a taint on node1 that has key node-role.kubernetes.io/infra and taint effect NoSchedule.`, which does not match the example because it describes `NoSchedule`. Readers cannot tell which of the two taints they should actually apply.

      Add a taint to the infra node to prevent scheduling user workloads on it:

      Determine if the node has the taint:

      $ oc describe nodes <node_name>

      Sample output

      oc describe node ci-ln-iyhx092-f76d1-nvdfm-worker-b-wln2l
      Name: ci-ln-iyhx092-f76d1-nvdfm-worker-b-wln2l
      Roles: worker
      ...
      Taints: node-role.kubernetes.io/infra:NoSchedule
      ...

      This example shows that the node has a taint. You can proceed with adding a toleration to your pod in the next step.

      If you have not configured a taint to prevent scheduling user workloads on it:

      $ oc adm taint nodes <node_name> <key>=<value>:<effect>

      For example:

      $ oc adm taint nodes node1 node-role.kubernetes.io/infra=reserved:NoExecute

      You can alternatively apply the following YAML to add the taint:

      kind: Node
      apiVersion: v1
      metadata:
        name: <node_name>
        labels:
          ...
      spec:
        taints:
          - key: node-role.kubernetes.io/infra
            effect: NoExecute
            value: reserved
        ...

      This example places a taint on node1 that has key node-role.kubernetes.io/infra and taint effect NoSchedule. Nodes with the NoSchedule effect schedule only pods that tolerate the taint, but allow existing pods to remain scheduled on the node.

      If a descheduler is used, pods violating node taints could be evicted from the cluster.

      Add tolerations for the pod configurations you want to schedule on the infra node, like router, registry, and monitoring workloads. Add the following code to the Pod object specification:

      tolerations:
        - effect: NoExecute
          key: node-role.kubernetes.io/infra
          operator: Exists
          value: reserved

      Note that NoExecute taints break `oc debug` pods unless further workarounds are applied:
      https://access.redhat.com/solutions/4976641
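
To check which effect an infra taint on a node currently has, something like the following could be used. This is a minimal sketch: the helper name `infra_taint_effects` is hypothetical (it is not part of `oc`), and it assumes the `Taints:` section of the `oc describe node` output looks like the sample quoted above.

```shell
# Hypothetical helper (not an oc subcommand): reads `oc describe node` output
# on stdin and prints the effect of every node-role.kubernetes.io/infra taint
# it finds, one effect per line.
infra_taint_effects() {
    # Match "<key>[=<value>]:<effect>" for the infra key, then keep only
    # the text after the last colon, which is the effect.
    grep -o 'node-role\.kubernetes\.io/infra[^[:space:]]*:[A-Za-z]*' | sed 's/.*://'
}

# Intended usage against a live cluster (assumes a node named node1):
#   oc describe node node1 | infra_taint_effects
```

A node that prints `NoExecute` here is one where `oc debug` pods would be affected as described in the linked solution.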

      To avoid going down a configuration rabbit hole, I recommend using the NoSchedule taint consistently for infra nodes.
      If users are converting a normal worker node to an infra node, instruct them to manually remove any pods they do not want on the node after making it an infra node.
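
For illustration, a consistent NoSchedule-only procedure might look roughly like this. It is a sketch under assumptions, not the documented procedure: the function names and the `OC` override variable are hypothetical, and the taint is applied without a value, as in the sample `oc describe` output quoted above.

```shell
# Assumption: OC is a hypothetical override so the commands can be previewed.
# Set OC="echo oc" to print the commands instead of running them on a cluster.
OC="${OC:-oc}"

# Remove the NoExecute variant of the infra taint from node $1.
# The trailing "-" tells `oc adm taint` to remove the taint with that key and effect.
remove_infra_noexecute() {
    $OC adm taint nodes "$1" node-role.kubernetes.io/infra:NoExecute-
}

# Apply the NoSchedule variant to node $1.
# --overwrite allows replacing an existing taint with the same key and effect.
apply_infra_noschedule() {
    $OC adm taint nodes "$1" node-role.kubernetes.io/infra:NoSchedule --overwrite
}

# List the pods currently running on node $1 so that unwanted ones
# can be reviewed and deleted manually.
list_pods_on_node() {
    $OC get pods --all-namespaces --field-selector "spec.nodeName=$1" -o wide
}
```

The matching toleration in the pod specification would then use `effect: NoSchedule` with the same key, instead of `NoExecute`.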

              ocp-docs-bot OCP DocsBot
              akaris@redhat.com Andreas Karis