-
Task
-
Resolution: Done
-
Critical
-
None
-
None
-
None
-
None
-
5
-
False
-
Blocked by https://issues.redhat.com/browse/HIVE-1976
-
True
-
Yes
-
MGDSRVS-346 - Autoscaling of OpenShift Streams Data Plane
-
MK - Sprint 224
WHAT
We will use the OCM cluster autoscaling concept as part of our dynamic scaling epic in https://issues.redhat.com/browse/MGDSTRM-6543. The scaling criteria are the ones described in https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#basics
WHY
Enable OCM cluster autoscaling for the Production cluster so that its nodes are scaled automatically
HOW
The HOW part can be done as sub-tasks:
- Create a ticket in https://issues.redhat.com/projects/SDB/issues to have the autoscale capability enabled for the organization. The organization IDs are: external ID `14410147` and ID `1qQpRlvmFaaOmFRxs1cyxUCPRP3`. The ticket is similar to https://issues.redhat.com/browse/SDB-2847
- Get confirmation from BU that we can enable autoscaling for Prod env
- Make sure that once autoscaling is enabled for the Prod OSD cluster, the maximum number of nodes is set to the current number of nodes. This ensures that we do not expand the cluster beyond the current number of nodes that support the Prod workload. This will need to be done in collaboration with CS-SRE
- When the cluster autoscaler is turned on, make sure that `balanceSimilarNodeGroups` is set to 'true' in the default "ClusterAutoscaler" object created by Hive (see https://issues.redhat.com/browse/HIVE-1976 for an RFE for Hive to do this automatically)
- Add an SOP on how autoscaling can be done for new Prod clusters (This is related to https://issues.redhat.com/browse/MGDSTRM-9233)
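The two technical sub-tasks above (capping the maximum at the current node count, and checking `balanceSimilarNodeGroups`) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names and the `min_replicas`/`max_replicas` keys are hypothetical and are not taken from OCM or Hive, and the only field assumed about the "ClusterAutoscaler" object is `spec.balanceSimilarNodeGroups`, as named in this ticket.

```python
import json
from typing import Optional


def autoscaling_bounds(current_nodes: int, min_nodes: Optional[int] = None) -> dict:
    """Return autoscaling bounds whose maximum equals the current node count.

    Hypothetical helper: the key names are illustrative, not an OCM API.
    """
    if current_nodes < 1:
        raise ValueError("current_nodes must be at least 1")
    # Default the minimum to the current size so the cluster does not shrink either.
    lower = current_nodes if min_nodes is None else min_nodes
    if not 1 <= lower <= current_nodes:
        raise ValueError("min_nodes must be between 1 and current_nodes")
    return {"min_replicas": lower, "max_replicas": current_nodes}


def balance_patch() -> dict:
    """Merge-patch body that sets balanceSimilarNodeGroups to true."""
    return {"spec": {"balanceSimilarNodeGroups": True}}


def is_balanced(cluster_autoscaler: dict) -> bool:
    """Check a ClusterAutoscaler-shaped dict for balanceSimilarNodeGroups == true."""
    # Only a real boolean True counts; the string "true" is rejected.
    return cluster_autoscaler.get("spec", {}).get("balanceSimilarNodeGroups") is True


if __name__ == "__main__":
    print(json.dumps(autoscaling_bounds(current_nodes=12)))
    print(json.dumps(balance_patch()))
```

The patch body is plain JSON, so it could in principle be passed to a merge patch against the default "ClusterAutoscaler" object. Whether a manual change persists on a Hive-managed cluster is not stated in this ticket, which is part of why HIVE-1976 asks for Hive to set the value itself.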
DONE
Include the following where applicable:
- BU confirmation obtained that we can enable autoscaling for the Prod OSD cluster: we have BU approval. This was posted in https://issues.redhat.com/browse/MGDSTRM-7998
- `balanceSimilarNodeGroups` is set to 'true' in the default "ClusterAutoscaler" object created by Hive.
- An SOP on how to turn autoscaling on in the cluster is also created
- BU approval for the increased creation time for "standard" instances in prod granted. We would like to keep the creation time of "trial/developer" instances as is for now, to ensure a better experience. Once we have the "reserved capacity" feature in place, we can also enable autoscaling on "trial/developer" OSDs.
- is blocked by
-
HIVE-1976 Tuning "BalanceSimilarNodeGroups" to true by default when creating the "default" ClusterAutoscaler
- Closed
-
MGDSTRM-8901 Investigating and testing that No (impacting) warnings when autoscaler scales OSD worker nodes beyond certain capacity
- Closed