Description of the Problem
During ROSA or OSD cluster creation via the wizard, the user initially selected multizone availability in the "Cluster settings > Details" step and proceeded to the "Default machine pool" step. The user then went back to the "Cluster settings > Details" step and changed the availability to single zone. The user continued through the remaining steps and tried to create the cluster, which failed with the backend error shown below.
CLUSTERS-MGMT-400: The maximum worker nodes allowed by the cluster autoscaler '250' exceeds the maximum allowed '249'. Reduce the maximum worker nodes allowed by the cluster autoscaler to be within the maximum allowed.
Further investigation showed that the cluster autoscaling "max-node-total" value is not reset to the default for the selected cluster version and availability zone when either of those selections is changed in an earlier wizard step. This was exposed by the fix for OCMUI-2796.
Please find attached the screen recording: Screen Recording 2024-12-17 at 4.49.11 PM.mov
The recommendation is therefore that the cluster autoscaling "max-node-total" value should reset to its default whenever the originally selected cluster version or cluster availability zone is changed.
Additionally, the wizard should notify the user that the "max-node-total" value was reset for that reason. For example, an info note near "Edit cluster autoscaling settings" would help here.
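The recommended reset could be sketched roughly as follows. This is only an illustration under stated assumptions: `WizardSelection`, `reconcileMaxNodesTotal`, and the `resolveDefault` lookup are hypothetical names, not taken from the OCM UI code, and the actual default values would have to come from the same rules the backend enforces.

```typescript
// Hypothetical shapes; none of these names come from the OCM UI codebase.
type Availability = "single" | "multi";

interface WizardSelection {
  clusterVersion: string;
  availability: Availability;
}

interface ReconcileResult {
  maxNodesTotal: number;
  // True when the value actually changed, so the UI can show the info note.
  wasReset: boolean;
}

// resolveDefault is an assumed lookup that returns the default
// max-node-total for a given cluster version and availability selection.
function reconcileMaxNodesTotal(
  previous: WizardSelection,
  next: WizardSelection,
  currentValue: number,
  resolveDefault: (selection: WizardSelection) => number,
): ReconcileResult {
  const selectionChanged =
    previous.clusterVersion !== next.clusterVersion ||
    previous.availability !== next.availability;

  // Nothing relevant changed: keep whatever the user has set.
  if (!selectionChanged) {
    return { maxNodesTotal: currentValue, wasReset: false };
  }

  // Version or availability changed: fall back to the default for the new selection.
  const nextDefault = resolveDefault(next);
  return { maxNodesTotal: nextDefault, wasReset: nextDefault !== currentValue };
}
```

Returning `wasReset` alongside the value is one way to drive the suggested info note from the same place that performs the reset, so the two cannot drift apart.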
How reproducible:
Always
Steps to Reproduce:
1. Open OCM UI staging.
2. Launch the ROSA classic or OSD (CCS) wizard.
3. Go to the "Cluster settings > Details" step.
4. Select "Multizone" as the availability zone and fill in all other required fields.
5. Click the "Next" button.
6. Select the "Enable autoscaling" checkbox.
7. Click the "Back" button.
8. Change the availability zone to "Single zone".
9. Fill in all required fields in each step and proceed to the "Review and create" step.
10. Click "Create cluster" and observe the behavior.
11. Repeat all of the above steps, replacing step 4 and step 8 as follows:
Step 4: Select a cluster version >= 4.14.14
Step 8: Change the cluster version to one < 4.14.14
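The version swap in the repeated run depends on comparing versions numerically, since a plain string comparison would order "4.9.0" after "4.14.14". A minimal sketch of that check is below; `compareVersions` and `crossesThreshold` are hypothetical helpers, and the sketch assumes purely numeric dotted versions (no pre-release suffixes).

```typescript
// Compare dotted numeric versions such as "4.9.0" and "4.14.14".
// Returns a negative number if a < b, 0 if equal, a positive number if a > b.
function compareVersions(a: string, b: string): number {
  const partsA = a.split(".").map(Number);
  const partsB = b.split(".").map(Number);
  const length = Math.max(partsA.length, partsB.length);
  for (let i = 0; i < length; i++) {
    // Missing components are treated as 0, so "4.14" equals "4.14.0".
    const diff = (partsA[i] ?? 0) - (partsB[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// True when moving from one version to another crosses the threshold
// in either direction (the 4.14.14 default is taken from the steps above).
function crossesThreshold(from: string, to: string, threshold = "4.14.14"): boolean {
  const fromAtOrAbove = compareVersions(from, threshold) >= 0;
  const toAtOrAbove = compareVersions(to, threshold) >= 0;
  return fromAtOrAbove !== toAtOrAbove;
}
```

A check like this could be used to decide whether a version change requires the "max-node-total" value to be re-evaluated at all.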
Actual results:
The default value and limits of the cluster autoscaling "max-node-total" setting depend on the availability zone as well as the cluster version.
In both situations above, i.e. when the initially selected availability zone or cluster version is changed, the "max-node-total" value is not reset accordingly. As a result, the wizard submission is blocked by an error from the backend.
Expected results:
The cluster autoscaling "max-node-total" default value should reset based on the latest selected cluster availability zone and cluster version.
The user should also be notified, through a note or tooltip in the wizard, that any change to the cluster availability zone or cluster version will reset the "max-node-total" value.