Story
Resolution: Done
We can create a test case that covers this scenario:
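The scenarios below assume the target worker node was already set up as an infra node with the label and taints that the nodeSelector/tolerations refer to. The ticket does not show that preparation; a minimal sketch, using the node name from the outputs below and the label/taint values from the SMCP fragments, could look like this:
# label the node as infra and taint it so only tolerating pods can land there
kubectl label node fhocp412-v9l5b-worker-0-z6h8l node-role.kubernetes.io/infra=""
kubectl taint node fhocp412-v9l5b-worker-0-z6h8l node-role.kubernetes.io/infra=reserved:NoSchedule
kubectl taint node fhocp412-v9l5b-worker-0-z6h8l node-role.kubernetes.io/infra=reserved:NoExecute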
Tested these scenarios:
- All control plane elements on the infra node, by adding the node selector and tolerations to the SMCP runtime defaults:
spec:
  runtime:
    defaults:
      pod:
        nodeSelector:
          node-role.kubernetes.io/infra: ""
        tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          value: reserved
        - effect: NoExecute
          key: node-role.kubernetes.io/infra
          value: reserved
frherrer@frherrer-mac repos % kubectl get pods --all-namespaces --field-selector spec.nodeName=fhocp412-v9l5b-worker-0-z6h8l
NAMESPACE                                NAME                                       READY   STATUS    RESTARTS   AGE
istio-system                             grafana-dc59b476-d9qkp                     2/2     Running   0          119s
istio-system                             istio-egressgateway-8578fbf664-rg5z9       1/1     Running   0          2m
istio-system                             istio-ingressgateway-b6dbb8659-prc8s       1/1     Running   0          2m
istio-system                             istiod-basic-6f88d8ddf4-bvjt6              1/1     Running   0          2m16s
istio-system                             jaeger-6b4dfc4b68-w2nk4                    2/2     Running   0          118s
istio-system                             kiali-8bf8dd6f-c7vc2                       1/1     Running   0          82s
istio-system                             prometheus-5768879748-qc2dd                2/2     Running   0          2m4s
openshift-cluster-csi-drivers            openstack-cinder-csi-driver-node-zkkvj     3/3     Running   0          31h
openshift-cluster-node-tuning-operator   tuned-9djcx                                1/1     Running   0          31h
openshift-dns                            node-resolver-mzz6s                        1/1     Running   0          31h
openshift-image-registry                 node-ca-45gcr                              1/1     Running   0          31h
openshift-machine-config-operator        machine-config-daemon-qwsmn                2/2     Running   0          31h
openshift-monitoring                     node-exporter-lchmz                        2/2     Running   0          31h
openshift-multus                         multus-additional-cni-plugins-kkkpv        1/1     Running   0          31h
openshift-multus                         multus-f8ndk                               1/1     Running   0          31h
openshift-multus                         network-metrics-daemon-l72b5               2/2     Running   0          31h
openshift-network-diagnostics            network-check-target-mz62w                 1/1     Running   0          31h
openshift-openstack-infra                coredns-fhocp412-v9l5b-worker-0-z6h8l      2/2     Running   0          31h
openshift-openstack-infra                keepalived-fhocp412-v9l5b-worker-0-z6h8l   2/2     Running   0          31h
openshift-operators                      istio-cni-node-v2-3-rdvpl                  1/1     Running   0          30h
openshift-operators                      istio-operator-7c5b49f4cb-7qx5c            1/1     Running   0          5h23m
openshift-sdn                            sdn-5gz5g                                  2/2     Running   0          31h
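For reference, one way to apply a fragment like the one above is a merge patch against the control plane resource. This is only a sketch, assuming the SMCP is the one named basic in istio-system (suggested by the istiod-basic pod name above), not the exact command used in the test:
# JSON merge patch; replaces spec.runtime.defaults.pod as a whole
kubectl -n istio-system patch smcp basic --type merge -p '{
  "spec": {
    "runtime": {
      "defaults": {
        "pod": {
          "nodeSelector": { "node-role.kubernetes.io/infra": "" },
          "tolerations": [
            { "effect": "NoSchedule", "key": "node-role.kubernetes.io/infra", "value": "reserved" },
            { "effect": "NoExecute", "key": "node-role.kubernetes.io/infra", "value": "reserved" }
          ]
        }
      }
    }
  }
}'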
- Istiod pod only, by adding the node selector and tolerations to the pilot component:
runtime:
  components:
    pilot:
      pod:
        nodeSelector:
          node-role.kubernetes.io/infra: ""
        tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          value: reserved
        - effect: NoExecute
          key: node-role.kubernetes.io/infra
          value: reserved
frherrer@frherrer-mac repos % kubectl get pods --all-namespaces --field-selector spec.nodeName=fhocp412-v9l5b-worker-0-z6h8l
NAMESPACE                       NAME                                     READY   STATUS    RESTARTS   AGE
istio-system                    istiod-basic-84dff849c-l688t             1/1     Running   0          12s
openshift-cluster-csi-drivers   openstack-cinder-csi-driver-node-zkkvj   3/3     Running   0          31h
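The same placement can also be confirmed from the workload side; a small sketch, assuming the istiod deployment carries the usual app=istiod label:
# print each istiod pod together with the node it was scheduled on
kubectl -n istio-system get pods -l app=istiod -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.nodeName}{"\n"}{end}'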
- For kiali:
spec:
  runtime:
    components:
      kiali:
        pod:
          nodeSelector:
            node-role.kubernetes.io/infra: ''
          tolerations:
          - effect: NoSchedule
            key: node-role.kubernetes.io/infra
            value: reserved
          - effect: NoExecute
            key: node-role.kubernetes.io/infra
            value: reserved
frherrer@frherrer-mac repos % kubectl get pods --all-namespaces --field-selector spec.nodeName=fhocp412-v9l5b-worker-0-z6h8l
NAMESPACE                       NAME                                     READY   STATUS    RESTARTS   AGE
istio-system                    kiali-f8596b4c9-snvwz                    0/1     Running   0          9s
openshift-cluster-csi-drivers   openstack-cinder-csi-driver-node-zkkvj   3/3     Running   0          31h
- For Jaeger:
spec:
  runtime:
    components:
      jaeger:
        pod:
          nodeSelector:
            node-role.kubernetes.io/infra: ''
          tolerations:
          - effect: NoSchedule
            key: node-role.kubernetes.io/infra
            value: reserved
          - effect: NoExecute
            key: node-role.kubernetes.io/infra
            value: reserved
frherrer@frherrer-mac repos % kubectl get pods --all-namespaces --field-selector spec.nodeName=fhocp412-v9l5b-worker-0-z6h8l
NAMESPACE                       NAME                                     READY   STATUS    RESTARTS   AGE
istio-system                    jaeger-6b4dfc4b68-8fhn5                  2/2     Running   0          17s
openshift-cluster-csi-drivers   openstack-cinder-csi-driver-node-zkkvj   3/3     Running   0          32h
- For prometheus:
spec:
  runtime:
    components:
      prometheus:
        pod:
          nodeSelector:
            node-role.kubernetes.io/infra: ''
          tolerations:
          - effect: NoSchedule
            key: node-role.kubernetes.io/infra
            value: reserved
          - effect: NoExecute
            key: node-role.kubernetes.io/infra
            value: reserved
frherrer@frherrer-mac repos % kubectl get pods --all-namespaces --field-selector spec.nodeName=fhocp412-v9l5b-worker-0-z6h8l
NAMESPACE                       NAME                                     READY   STATUS    RESTARTS   AGE
istio-system                    prometheus-5768879748-nhggf              2/2     Running   0          2m2s
openshift-cluster-csi-drivers   openstack-cinder-csi-driver-node-zkkvj   3/3     Running   0          32h
- For grafana:
spec:
  runtime:
    components:
      grafana:
        pod:
          nodeSelector:
            node-role.kubernetes.io/infra: ''
          tolerations:
          - effect: NoSchedule
            key: node-role.kubernetes.io/infra
            value: reserved
          - effect: NoExecute
            key: node-role.kubernetes.io/infra
            value: reserved
frherrer@frherrer-mac repos % kubectl get pods --all-namespaces --field-selector spec.nodeName=fhocp412-v9l5b-worker-0-z6h8l
NAMESPACE                       NAME                                     READY   STATUS    RESTARTS   AGE
istio-system                    grafana-dc59b476-6sgkc                   2/2     Running   0          6s
openshift-cluster-csi-drivers   openstack-cinder-csi-driver-node-zkkvj   3/3     Running   0          32h
- Ingress and egress gateways:
spec:
  gateways:
    ingress:
      runtime:
        pod:
          nodeSelector:
            node-role.kubernetes.io/infra: ""
          tolerations:
          - effect: NoSchedule
            key: node-role.kubernetes.io/infra
            value: reserved
          - effect: NoExecute
            key: node-role.kubernetes.io/infra
            value: reserved
    egress:
      runtime:
        pod:
          nodeSelector:
            node-role.kubernetes.io/infra: ""
          tolerations:
          - effect: NoSchedule
            key: node-role.kubernetes.io/infra
            value: reserved
          - effect: NoExecute
            key: node-role.kubernetes.io/infra
            value: reserved
frherrer@frherrer-mac repos % kubectl get pods --all-namespaces --field-selector spec.nodeName=fhocp412-v9l5b-worker-0-z6h8l
NAMESPACE      NAME                                   READY   STATUS    RESTARTS   AGE
istio-system   istio-egressgateway-8578fbf664-mg526   1/1     Running   0          19s
istio-system   istio-ingressgateway-b6dbb8659-l5jz4   1/1     Running   0          19s
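After running the scenarios above, a single overview of where every control plane pod ended up can be taken from the namespace side; a small sketch:
# the NODE column should show the infra node for every control plane pod
kubectl -n istio-system get pods -o custom-columns='NAME:.metadata.name,NODE:.spec.nodeName'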
Additional tests:
- Added a node selector and tolerations containing errors (a hypothetical example is sketched after this list):
- Expected behavior achieved: the new pod is not scheduled because the tolerations/node selector cannot be satisfied, and the old pod keeps running, so no service is affected
- Added spec.runtime.components.nonexistentcomponent: the Istio operator does nothing with it and everything still works
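The exact erroneous values used in the negative test are not recorded here. For illustration only, a hypothetical misconfiguration of the kind described could be a node selector value that no node carries, again assuming an SMCP named basic; the replacement pod then stays Pending while the old one keeps serving:
# hypothetical bad value: no node has this label value, so the new pod cannot be scheduled
kubectl -n istio-system patch smcp basic --type merge -p '{
  "spec": {
    "runtime": {
      "components": {
        "pilot": {
          "pod": {
            "nodeSelector": { "node-role.kubernetes.io/infra": "no-such-value" }
          }
        }
      }
    }
  }
}'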
Tested that after each pod move the component still works (Jaeger, Prometheus, Grafana, etc.: UI accessible and displaying data). Moving the task to Release Pending.
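For completeness, a sketch of how the UI checks above could be repeated quickly after each move, assuming the default route names created by the operators (e.g. kiali):
# list the UI routes, then probe one of them; expect HTTP 200 or a redirect to the OpenShift login
kubectl -n istio-system get routes
curl -k -s -o /dev/null -w '%{http_code}\n' "https://$(kubectl -n istio-system get route kiali -o jsonpath='{.spec.host}')"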