- Feature
- Resolution: Unresolved
Feature Overview:
Detect when a cluster is poorly sized: nodes that are too big, too many nodes, etc. Similar to ROS for RHEL.
Scope and what is possible are TBD; this needs a conversation with the Kruize and AppLearner teams.
- Define default utilization warning levels (e.g. low = 20% and high = 80%)
- Allow users to define their own default utilization warning levels
- Allow users to define their own utilization warning levels for the nodes of a specific cluster
- Allow users to define their own utilization warning levels for control plane nodes vs. worker nodes
- Allow users to define specific utilization warning levels per instance size (e.g. 70% on a small worker node might be critical, but 90% on a large worker node might be OK)
- Recommend "based on T time of observation, we think you can downsize your cluster from X nodes to Y nodes" (where X > Y), or vice versa
- Look at unscheduled jobs
- Look at average, peak, valley, time over the high warning level, time under the low warning level, etc.
- Let users define which instance types they can use. Organizations typically standardize on (or get better deals for) some instance types.
- Let users define which cloud regions they can operate in and the minimum/maximum number of regions they can have. Organizations typically standardize on some regions because of compliance, proximity, etc.
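To make the discussion with the Kruize and AppLearner teams more concrete, the threshold and statistics items above could be sketched roughly as follows. This is only an illustration under stated assumptions: the names `Thresholds`, `resolve_thresholds`, `summarize`, and `suggest_node_count` are hypothetical, the override precedence (instance size, then node role, then cluster, then user default) is one possible choice rather than a decision, and the node-count estimate assumes equal-size nodes with load that spreads evenly.

```python
import math
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class Thresholds:
    """Utilization warning levels as fractions (0.20 means 20%)."""
    low: float = 0.20   # example default taken from the list above
    high: float = 0.80  # example default taken from the list above


DEFAULT = Thresholds()

# Hypothetical override store: keys are (scope, name) pairs,
# e.g. ("size", "small"), ("role", "worker"), ("cluster", "prod").
Overrides = Dict[Tuple[str, str], Thresholds]


def resolve_thresholds(overrides: Overrides, cluster: str, role: str,
                       instance_size: str) -> Thresholds:
    """Return the most specific thresholds that apply to a node.

    Assumed precedence: instance size > role > cluster > user default,
    falling back to the built-in default when nothing matches.
    """
    for key in (("size", instance_size), ("role", role),
                ("cluster", cluster), ("user", "default")):
        if key in overrides:
            return overrides[key]
    return DEFAULT


def summarize(samples: Sequence[float], thresholds: Thresholds,
              interval_minutes: int = 5) -> dict:
    """Average, peak, valley, and time spent outside the warning band.

    `samples` are utilization fractions taken every `interval_minutes`.
    """
    if not samples:
        raise ValueError("at least one utilization sample is required")
    over = sum(1 for s in samples if s > thresholds.high)
    under = sum(1 for s in samples if s < thresholds.low)
    return {
        "average": sum(samples) / len(samples),
        "peak": max(samples),
        "valley": min(samples),
        "minutes_over_high": over * interval_minutes,
        "minutes_under_low": under * interval_minutes,
    }


def suggest_node_count(current_nodes: int, peak_utilization: float,
                       high: float) -> int:
    """Smallest node count that keeps the observed peak at or below `high`.

    Assumes equal-size nodes and evenly spreadable load; rounding guards
    against floating-point noise before taking the ceiling.
    """
    needed = math.ceil(round(current_nodes * peak_utilization / high, 9))
    return max(1, needed)
```

For example, with a per-size override for small nodes, `resolve_thresholds` returns the small-node levels for a small worker and falls back to the role or default levels otherwise. `suggest_node_count(10, 0.5, 0.8)` would suggest that 7 nodes could carry the observed peak under these simplifying assumptions. Unscheduled jobs, allowed instance types, and region limits are not modeled here.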
Docs:
Yes
SMEs and Stakeholders:
Name | Role |
PM | |
Eng Manager | |
UX | |