-
Bug
-
Resolution: Done-Errata
-
Critical
-
4.14.z, 4.15.0
-
Moderate
-
No
-
Hypershift Sprint 243
-
1
-
Proposed
-
False
-
-
Release Note Not Required
-
In Progress
Description of problem:
The HyperShift Operator does not guarantee that two request serving nodes are labeled with the HCP's namespace-name. It likely labels the nodes when the HCP is first placed and then does not notice when one of those nodes is later deleted by something other than the operator.
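The suspected failure mode is one-time labeling rather than continuous reconciliation. The following is a minimal Python sketch of a level-triggered check that could run on every reconcile. It is not the HyperShift Operator's implementation: the function name `nodes_to_label`, the dict-based node representation, and the rule for picking candidate nodes are all assumptions made for illustration. Only the label key `hypershift.openshift.io/cluster` is taken from the commands in this report.

```python
CLUSTER_LABEL = "hypershift.openshift.io/cluster"  # label key used in this report
REQUIRED_NODES = 2  # the expected results call for two labeled nodes per HCP


def nodes_to_label(nodes, hcp_name, required=REQUIRED_NODES):
    """Return names of unlabeled nodes to claim so `required` nodes carry the HCP label.

    `nodes` is a list of dicts shaped like {"name": str, "labels": dict}.
    This shape is hypothetical and stands in for real Node objects.
    """
    labeled = [n for n in nodes if n["labels"].get(CLUSTER_LABEL) == hcp_name]
    missing = required - len(labeled)
    if missing <= 0:
        return []  # invariant already holds, nothing to do
    # Assumption: any node without the cluster label is a free candidate.
    free = [n["name"] for n in nodes if CLUSTER_LABEL not in n["labels"]]
    return free[:missing]


# Hypothetical state after one labeled node was deleted and replaced.
nodes = [
    {"name": "node-a", "labels": {CLUSTER_LABEL: "hcp-1"}},
    {"name": "node-b", "labels": {}},  # replacement node, arrived unlabeled
    {"name": "node-c", "labels": {CLUSTER_LABEL: "hcp-2"}},
]
print(nodes_to_label(nodes, "hcp-1"))
```

Because the function recomputes the shortfall from the current node list each time, a replacement node is picked up on the next pass instead of depending on an event that fired only at creation.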
Version-Release number of selected component (if applicable):
How reproducible:
100%
Steps to Reproduce:
1. Create an HCP with dedicated request serving nodes.
2. Delete one of the request serving nodes (by deleting the node directly or by deleting its machine).
3. Observe that the replacement node does not have the label required for scheduling the HCP's request-serving pods.
Actual results:
HCPs can exist without two nodes labeled with the HCP's name, which leaves kube-apiserver pods unschedulable:
❯ k get no -lhypershift.openshift.io/cluster=ocm-staging-26ljge23ub1112ve884u0opvkj2c4lpc-perf-rhcp-0012
NAME                                        STATUS   ROLES    AGE   VERSION
ip-10-0-34-188.us-east-2.compute.internal   Ready    worker   9h    v1.27.6+1648878
❯ k get po -n ocm-staging-26ljge23ub1112ve884u0opvkj2c4lpc-perf-rhcp-0012 -lapp=kube-apiserver -owide
NAME                             READY   STATUS    RESTARTS   AGE    IP             NODE                                        NOMINATED NODE   READINESS GATES
kube-apiserver-54854bcb7-v88dq   0/5     Pending   0          151m   <none>         <none>                                      <none>           <none>
kube-apiserver-54854bcb7-x5jqt   5/5     Running   0          3h2m   10.128.236.6   ip-10-0-34-188.us-east-2.compute.internal   <none>           <none>
Expected results:
Every HCP has two nodes labeled with the HCP's name
❯ k get po -n ocm-staging-26ljip0ck3d2i1bejp2sipio4okhgttn-perf-rhcp-0017 -l app=kube-apiserver -owide
NAME                            READY   STATUS    RESTARTS   AGE    IP             NODE                                        NOMINATED NODE   READINESS GATES
kube-apiserver-5f85cd4b-l57qr   5/5     Running   0          169m   10.128.218.6   ip-10-0-114-35.us-east-2.compute.internal   <none>           <none>
kube-apiserver-5f85cd4b-lqfsx   5/5     Running   0          169m   10.128.129.6   ip-10-0-59-232.us-east-2.compute.internal   <none>           <none>
❯ k get no -lhypershift.openshift.io/cluster=ocm-staging-26ljip0ck3d2i1bejp2sipio4okhgttn-perf-rhcp-0017
NAME                                        STATUS   ROLES    AGE    VERSION
ip-10-0-114-35.us-east-2.compute.internal   Ready    worker   24h    v1.27.6+1648878
ip-10-0-59-232.us-east-2.compute.internal   Ready    worker   5d2h   v1.27.6+1648878
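The expected state can also be checked programmatically. The following Python sketch counts Ready nodes carrying the cluster label in a NodeList of the shape produced by `kubectl get nodes -o json`. The function name `ready_labeled_nodes` and the sample data are illustrative assumptions: the sample is hand-written to mirror the failing cluster above, not captured output, and `replacement-node` is a made-up node name.

```python
CLUSTER_LABEL = "hypershift.openshift.io/cluster"  # label key used in this report
CLUSTER = "ocm-staging-26ljge23ub1112ve884u0opvkj2c4lpc-perf-rhcp-0012"


def ready_labeled_nodes(node_list, cluster):
    """Return names of Ready nodes whose cluster label equals `cluster`."""
    names = []
    for node in node_list.get("items", []):
        labels = node.get("metadata", {}).get("labels", {})
        if labels.get(CLUSTER_LABEL) != cluster:
            continue
        conditions = node.get("status", {}).get("conditions", [])
        if any(c.get("type") == "Ready" and c.get("status") == "True"
               for c in conditions):
            names.append(node["metadata"]["name"])
    return names


# Hand-written sample: one labeled node plus an unlabeled replacement.
sample = {
    "items": [
        {
            "metadata": {
                "name": "ip-10-0-34-188.us-east-2.compute.internal",
                "labels": {CLUSTER_LABEL: CLUSTER},
            },
            "status": {"conditions": [{"type": "Ready", "status": "True"}]},
        },
        {
            "metadata": {"name": "replacement-node", "labels": {}},
            "status": {"conditions": [{"type": "Ready", "status": "True"}]},
        },
    ]
}

found = ready_labeled_nodes(sample, CLUSTER)
if len(found) < 2:
    print(f"invariant violated: {len(found)} labeled Ready node(s) for {CLUSTER}")
```

Against a live cluster, the same function could be fed the result of `json.load` on the output of `kubectl get nodes -o json`.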
Additional info: