Type: Bug
Resolution: Done
Priority: Undefined
Affects Version: 4.14
Fix Version: None
Severity: Important
Description of problem:
After successfully scaling from 6 down to 2 nodes, scaling back from 2 to 6 nodes ends with only 3 nodes, even though there are still 3 agents available to allocate.
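As a quick way to see the mismatch, comparing the NodePool's desired/current replica counts with the agent inventory makes the stuck state visible; a minimal check sketch, using the resource names from this environment:

# Desired vs. current replicas on the NodePool
oc get nodepool hosted-0 -n clusters

# All agents known in the hosted cluster namespace; six exist, only three bind
oc get agents -n hosted-0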
Version-Release number of selected component (if applicable):
[kni@ocp-edge77 ~]$ oc version
Client Version: 4.14.0-0.nightly-2023-12-16-101212
Kustomize Version: v5.0.1

[kni@ocp-edge77 ~]$ oc get hc -A
NAMESPACE   NAME       VERSION   KUBECONFIG                  PROGRESS    AVAILABLE   PROGRESSING   MESSAGE
clusters    hosted-0   4.14.7    hosted-0-admin-kubeconfig   Completed   True        False         The hosted control plane is available
How reproducible:
Intermittent; the issue only reproduces sometimes.
Steps to Reproduce:
1. Deploy a hub cluster + hosted cluster with 6 Agent-provider workers. I used this job for the deployment: https://auto-jenkins-csb-kniqe.apps.ocp-c1.prod.psi.redhat.com/job/CI/job/job-runner/2111/console
2. Scale down 6->2 nodes:
(.venv) [kni@ocp-edge77 ocp-edge-auto_cluster]$ oc scale nodepool/hosted-0 --namespace clusters --kubeconfig ~/clusterconfigs/auth/hub-kubeconfig --replicas=2
nodepool.hypershift.openshift.io/hosted-0 scaled
3. Wait until the nodepool reports 2 nodes (oc get nodepool -n clusters).
4. Scale up 2->6 nodes:
(.venv) [kni@ocp-edge77 ocp-edge-auto_cluster]$ oc scale nodepool/hosted-0 --namespace clusters --kubeconfig ~/clusterconfigs/auth/hub-kubeconfig --replicas=6
nodepool.hypershift.openshift.io/hosted-0 scaled
5. Recreate the ignition token by applying the workaround described here (see the consolidated sketch after these steps): https://github.com/stolostron/rhacm-docs/blob/2.9_stage/clusters/release_notes/known_issues.adoc#on-bare-metal-platforms-agent-resources-might-fail-to-pull-ignition
5.1 oc delete secret agent-user-data-hosted-0-d3a75541 -n clusters-hosted-0
5.2 Add a label to one of the agents by editing it: oc edit agent 030ae455-fe57-4029-9d73-c7fb8356b34f -n hosted-0 --kubeconfig ~/clusterconfigs/auth/hub-kubeconfig
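For reference, step 5 reduces to the two commands below. This is a sketch only: the secret name suffix (d3a75541) and the agent name are specific to this environment, and the label key/value (touch=retoken) is an arbitrary placeholder standing in for the manual oc edit:

# 5.1: delete the stale user-data secret so the ignition token is regenerated
oc delete secret agent-user-data-hosted-0-d3a75541 -n clusters-hosted-0

# 5.2: modify one agent to trigger reconciliation
#      (placeholder label instead of the interactive oc edit above)
oc label agent 030ae455-fe57-4029-9d73-c7fb8356b34f -n hosted-0 \
  --kubeconfig ~/clusterconfigs/auth/hub-kubeconfig touch=retoken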
Actual results:
[kni@ocp-edge77 ~]$ oc get nodepool -n clusters
NAME       CLUSTER    DESIRED NODES   CURRENT NODES   AUTOSCALING   AUTOREPAIR   VERSION   UPDATINGVERSION   UPDATINGCONFIG   MESSAGE
hosted-0   hosted-0   6               3               False                      4.14.7                                      Minimum availability requires 6 replicas, current 3 available

[kni@ocp-edge77 ~]$ oc get agents -n hosted-0
NAME                                   CLUSTER    APPROVED   ROLE     STAGE
030ae455-fe57-4029-9d73-c7fb8356b34f   hosted-0   true       worker   Done
0a398b9f-90ab-44e5-8175-54f53b201089   hosted-0   true       worker   Done
34b60917-7df2-46d8-96c7-455a797eec3c   hosted-0   true       worker   Done
5bc7ff28-70c8-4d22-8b3e-747e4c7e69a7              true       worker
91074b73-56ec-4df2-83a8-5eaea2983771              true       worker
a9116103-87eb-4205-aeb3-f38ad466efb4              true       worker
[kni@ocp-edge77 ~]$
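When triaging this state, the NodePool conditions and the status of one of the unbound agents are the first things to check; a minimal diagnostic sketch (the condition names in the comment come from the HyperShift NodePool API and may differ by version):

# NodePool conditions; look for messages on Ready / AllMachinesReady-type conditions
oc describe nodepool hosted-0 -n clusters

# Full status of an agent that was approved but never allocated
oc get agent 5bc7ff28-70c8-4d22-8b3e-747e4c7e69a7 -n hosted-0 -o yaml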
Expected results:
[kni@ocp-edge77 ~]$ oc get nodepool -n clusters
NAME       CLUSTER    DESIRED NODES   CURRENT NODES   AUTOSCALING   AUTOREPAIR   VERSION   UPDATINGVERSION   UPDATINGCONFIG   MESSAGE
hosted-0   hosted-0   6               6               False                      4.14.7

All 6 agents should be allocated to hosted-0, not only 3 of them:

[kni@ocp-edge77 ~]$ oc get agents -n hosted-0
NAME                                   CLUSTER    APPROVED   ROLE     STAGE
030ae455-fe57-4029-9d73-c7fb8356b34f   hosted-0   true       worker   Done
0a398b9f-90ab-44e5-8175-54f53b201089   hosted-0   true       worker   Done
34b60917-7df2-46d8-96c7-455a797eec3c   hosted-0   true       worker   Done
5bc7ff28-70c8-4d22-8b3e-747e4c7e69a7              true       worker
91074b73-56ec-4df2-83a8-5eaea2983771              true       worker
a9116103-87eb-4205-aeb3-f38ad466efb4              true       worker
[kni@ocp-edge77 ~]$
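To confirm the expected end state, counting the agents whose CLUSTER column is populated would suffice; a small sketch over the plain table output (no CRD field assumptions, just awk on the printed columns):

# Count agents bound to hosted-0; should print 6 once the bug is fixed
oc get agents -n hosted-0 --no-headers | awk '$2 == "hosted-0"' | wc -l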
Additional info:
- links to: RHEA-2024:0041 (OpenShift Container Platform 4.16.z bug fix update)