Bug
Resolution: Unresolved
Major
HCIDOCS 2024#9, HCIDOCS 2024#10, HCIDOCS 2024#11
Task scope: Change node names to "node-0..n" because role-based node names (master/worker-0..n) cause confusion.
Section to fix: 12.6 Installing a primary control plane node on an unhealthy cluster.
Unhealthy control plane node (master-2) is replaced by healthy control plane node (node-6).
Step 1:
NAME       STATUS     ROLES    AGE   VERSION
worker-1   Ready      worker   20h   v1.24.0+3882f8f
master-2   NotReady   master   20h   v1.24.0+3882f8f
master-3   Ready      master   20h   v1.24.0+3882f8f
worker-4   Ready      worker   20h   v1.24.0+3882f8f
master-5   Ready      worker   15h   v1.24.0+3882f8f
Rename the nodes so that the original control plane nodes are numbered 0-2, and list them first to make the output easier to read.
NAME     STATUS     ROLES    AGE   VERSION
node-0   Ready      master   20h   v1.24.0+3882f8f
node-1   NotReady   master   20h   v1.24.0+3882f8f
node-2   Ready      master   20h   v1.24.0+3882f8f
node-3   Ready      worker   20h   v1.24.0+3882f8f
node-4   Ready      worker   15h   v1.24.0+3882f8f
Update node names in commands in subsequent steps.
"node-1" (NotReady) is removed. The new node will be "node-6":
NAME     STATUS   ROLES    AGE   VERSION
node-0   Ready    master   20h   v1.24.0+3882f8f
node-2   Ready    master   20h   v1.24.0+3882f8f
node-3   Ready    worker   20h   v1.24.0+3882f8f
node-4   Ready    worker   15h   v1.24.0+3882f8f
node-6   Ready    master   40m   v1.24.0+3882f8f
12.5. Installing a primary control plane node on a healthy cluster
Healthy control plane node (master-0) is replaced by healthy control plane node (node-6).
Step 3:
NAME       STATUS   ROLES    AGE     VERSION
master-0   Ready    master   4h42m   v1.24.0+3882f8f
worker-1   Ready    worker   4h29m   v1.24.0+3882f8f
master-2   Ready    master   4h43m   v1.24.0+3882f8f
master-3   Ready    master   4h27m   v1.24.0+3882f8f
worker-4   Ready    worker   4h30m   v1.24.0+3882f8f
master-5   Ready    master   105s    v1.24.0+3882f8f
Rename the nodes, with control plane nodes first:
NAME     STATUS   ROLES    AGE     VERSION
node-0   Ready    master   4h42m   v1.24.0+3882f8f
node-1   Ready    master   4h29m   v1.24.0+3882f8f
node-2   Ready    master   4h43m   v1.24.0+3882f8f
node-3   Ready    master   4h27m   v1.24.0+3882f8f
node-4   Ready    worker   4h30m   v1.24.0+3882f8f
node-5   Ready    worker   105s    v1.24.0+3882f8f
Update commands in subsequent steps.
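The renaming rule (control plane nodes first, then workers, numbered from 0) can be sketched in Python. This is only an illustration: `rename_nodes` is a hypothetical helper, not part of any OpenShift tooling, and the input is the name and role of each node from the Step 3 output. Unlike a positional rename of the table, it reorders rows by role, so each node's AGE value would move with it.

```python
def rename_nodes(nodes):
    """Map role-based names to neutral "node-N" names.

    `nodes` is a list of (old_name, role) pairs in `oc get nodes` order.
    Control plane ("master") nodes are numbered first, then all other nodes;
    the original order is kept within each group.
    """
    # sorted() is stable, so nodes with the same role keep their relative order.
    ordered = sorted(nodes, key=lambda node: node[1] != "master")
    return {old: f"node-{i}" for i, (old, _role) in enumerate(ordered)}


# Names and roles copied from the Step 3 output in section 12.5.
step3 = [
    ("master-0", "master"),
    ("worker-1", "worker"),
    ("master-2", "master"),
    ("master-3", "master"),
    ("worker-4", "worker"),
    ("master-5", "master"),
]

mapping = rename_nodes(step3)
for old, new in mapping.items():
    print(f"{old} -> {new}")
```

With this input the four control plane nodes become node-0 to node-3 and the two workers become node-4 and node-5, which matches the role column of the renamed table.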
A new node will be added. It will be named "node-6".
Step 4: `$ bash link-machine-and-node.sh custom-master3 worker-5` <= I think worker-5 is going to be node-6.
Step 11.1: `$ oc delete bmh -n openshift-machine-api custom-master3` <= Should this be "$ oc delete bmh -n openshift-machine-api node-0"?
Step 11.iv, after the node is deleted (as currently documented). `master-0` (now node-0) is gone:
NAME       STATUS   ROLES    AGE   VERSION
worker-1   Ready    worker   19h   v1.24.0+3882f8f
master-2   Ready    master   20h   v1.24.0+3882f8f
master-3   Ready    master   19h   v1.24.0+3882f8f
worker-4   Ready    worker   19h   v1.24.0+3882f8f
master-5   Ready    master   15h   v1.24.0+3882f8f
I think the output would probably look like this. node-0 (aka master-0) is gone and node-6 (master) is ready.
NAME     STATUS   ROLES    AGE     VERSION
node-1   Ready    master   4h42m   v1.24.0+3882f8f
node-2   Ready    master   4h29m   v1.24.0+3882f8f
node-3   Ready    master   4h43m   v1.24.0+3882f8f
node-4   Ready    worker   4h27m   v1.24.0+3882f8f
node-5   Ready    worker   4h30m   v1.24.0+3882f8f
node-6   Ready    master   105s    v1.24.0+3882f8f
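When QE verifies the procedure, the proposed output above could be checked mechanically. The following Python sketch is one possible approach, not an existing tool: `check_replacement` is a hypothetical helper, and it assumes the plain-text `oc get nodes` layout shown in this ticket (one header row, whitespace-separated columns).

```python
import re


def check_replacement(output, removed, added):
    """Return a list of problems found in plain-text `oc get nodes` output."""
    # Skip the header row; each remaining row is NAME STATUS ROLES AGE VERSION.
    rows = [line.split() for line in output.strip().splitlines()[1:]]
    nodes = {row[0]: {"status": row[1], "roles": row[2]} for row in rows}

    # Every node name must follow the neutral node-N convention.
    problems = [
        f"{name}: does not follow the node-N convention"
        for name in nodes
        if not re.fullmatch(r"node-\d+", name)
    ]
    if removed in nodes:
        problems.append(f"{removed}: should have been deleted")
    new_node = nodes.get(added)
    if new_node is None:
        problems.append(f"{added}: missing")
    elif new_node["status"] != "Ready" or "master" not in new_node["roles"].split(","):
        problems.append(f"{added}: expected a Ready control plane node")
    return problems


# Proposed Step 11.iv output from this ticket.
expected = """
NAME     STATUS   ROLES    AGE     VERSION
node-1   Ready    master   4h42m   v1.24.0+3882f8f
node-2   Ready    master   4h29m   v1.24.0+3882f8f
node-3   Ready    master   4h43m   v1.24.0+3882f8f
node-4   Ready    worker   4h27m   v1.24.0+3882f8f
node-5   Ready    worker   4h30m   v1.24.0+3882f8f
node-6   Ready    master   105s    v1.24.0+3882f8f
"""

print(check_replacement(expected, removed="node-0", added="node-6"))
```

An empty list means that node-0 is gone, node-6 is a Ready control plane node, and every remaining name follows the node-N pattern.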
Source: email from Oved Ourfali:
The names of the nodes do cause a major confusion there, abusing the master/worker names.
I think that in order to make it more clear, we should probably:
1. Have all node names as "node-X" (node-0/1/2/3/4/5/6).
2. When a new node gets added, it gets the next number (in our case we're adding a worker node that then becomes a master node, iiuc, so it should be called node-6).
3. nodes 0/1/2 are the original master nodes. In 12.6 one of those gets replaced with node-6. In 12.5 one gets added (node-6).
In addition, we should have QE verify the steps to make sure those are right, but let's first get it more clear?
[1] https://docs.redhat.com/en/documentation/assisted_installer_for_openshift_container_platform/2024/html/installing_openshift_container_platform_with_the_assisted_installer/expanding-the-cluster#installing-primary-control-plane-node-healthy-cluster_expanding-the-cluster
[2] https://docs.redhat.com/en/documentation/assisted_installer_for_openshift_container_platform/2024/html/installing_openshift_container_platform_with_the_assisted_installer/expanding-the-cluster#installing-primary-control-plane-node-unhealthy-cluster_expanding-the-cluster
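The numbering rule from the email (a new node gets the next number) can be illustrated with a short Python sketch. `next_node_name` is a hypothetical helper written for this ticket. It assumes that a removed node's number is never reused, which is how the email's example arrives at node-6.

```python
import re


def next_node_name(existing):
    """Return the "node-N" name that follows the highest existing number."""
    numbers = [
        int(match.group(1))
        for name in existing
        if (match := re.fullmatch(r"node-(\d+)", name))
    ]
    # Start at node-0 when no node follows the convention yet.
    return f"node-{max(numbers) + 1 if numbers else 0}"


current = [f"node-{i}" for i in range(6)]  # node-0 .. node-5
print(next_node_name(current))
```

Because the helper takes the highest existing number rather than the first free one, a cluster that has already lost node-1 still gets node-6 as its next node.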
relates to:
- HCIDOCS-555 The node names mentioned on 'Installing a primary control plane node on a healthy cluster' documentation looks incorrect. (Review)
- HCIDOCS-522 Fix "Installing CP node on healthy cluster" procedure (Review)
- HCIDOCS-523 Fix "Installing CP node on unhealthy cluster" procedure (Review)
- HCIDOCS-517 Fix Bash script steps (Review)
- HCIDOCS-518 Fix 'oc rsh' command blocks (Review)