Type: Feature Request
Resolution: Unresolved
Priority: Normal
Fix Version/s: None
Affects Version/s: openshift-4.14
Component/s: Cluster Infrastructure
Work Type: Improvement
Blocked: False
Blocked Reason: None
Ready: False
Color Status: Not Selected

1. Proposed title of this feature request

Allow a resource's NodeAffinity to trigger creation of a suitable node if one does not already exist in the cluster.

2. What is the nature and description of the request?

Assume an OpenShift cluster has multiple machinesets and machine autoscalers in place. One machineset could be for 'regular' AMD VMs and a second for Spot instances. There could be further machinesets for GPU VMs and so on.

Imagine that the cluster currently has only 'regular' AMD VMs. There are no Spot instances yet.
Now, if a resource (e.g. a Deployment) is deployed with a NodeAffinity for a Spot instance (because it is a workload that can be interrupted) using "requiredDuringSchedulingIgnoredDuringExecution", its pods remain in the Pending state, because no Spot instance matching the label selector is available in the cluster to satisfy that NodeAffinity.
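
To illustrate why the pods stay Pending, here is a minimal Python sketch of how a required NodeAffinity is evaluated against node labels: the nodeSelectorTerms are ORed, and the matchExpressions inside one term are ANDed. The label key "example.com/capacity-type", its values, and the node names are hypothetical, since this request does not name the labels actually used. The sketch ignores matchFields and the Gt/Lt operators, and it is not the scheduler's real implementation.

```python
from typing import Dict, List

Labels = Dict[str, str]


def expression_matches(labels: Labels, expr: dict) -> bool:
    """Evaluate one matchExpression against a node's labels."""
    key, op = expr["key"], expr["operator"]
    values = expr.get("values", [])
    if op == "In":
        return labels.get(key) in values
    if op == "NotIn":
        # A node without the label also satisfies NotIn.
        return labels.get(key) not in values
    if op == "Exists":
        return key in labels
    if op == "DoesNotExist":
        return key not in labels
    raise ValueError(f"operator not covered by this sketch: {op}")


def term_matches(labels: Labels, term: dict) -> bool:
    """All expressions in a term must hold; an empty term matches nothing."""
    exprs = term.get("matchExpressions", [])
    return bool(exprs) and all(expression_matches(labels, e) for e in exprs)


def schedulable_nodes(nodes: Dict[str, Labels], terms: List[dict]) -> List[str]:
    """Return the names of nodes that satisfy at least one term."""
    return [name for name, labels in nodes.items()
            if any(term_matches(labels, t) for t in terms)]


# Hypothetical required affinity: the pod must land on a Spot node.
REQUIRED_TERMS = [{"matchExpressions": [
    {"key": "example.com/capacity-type", "operator": "In", "values": ["spot"]},
]}]

# Hypothetical cluster state: only 'regular' nodes exist.
NODES = {
    "worker-a": {"example.com/capacity-type": "regular"},
    "worker-b": {"example.com/capacity-type": "regular"},
}

if __name__ == "__main__":
    # No node matches, so the pod has nowhere to go and stays Pending.
    print(schedulable_nodes(NODES, REQUIRED_TERMS))
```

With the cluster state above, no node satisfies the required term, which corresponds to the Pending pods described in this request.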

Wouldn't it make more sense for the cluster to recognise that it currently has no Spot instances on which to schedule this workload, but that scaling up one of the Spot instance machinesets would let it schedule the workload onto the new Spot node once that node becomes "Ready"?
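
The requested behaviour could be sketched roughly as follows: when a pod is pending, look for a machineset whose nodes would carry labels that satisfy the pod's required NodeAffinity, and that still has room to grow. This is only an illustration of the idea, not how any existing autoscaler is implemented. The dictionary fields ("name", "node_labels", "replicas", "max_replicas"), the machineset names, and the label key are all hypothetical, and only the "In" operator is handled.

```python
from typing import Dict, List, Optional


def labels_satisfy(labels: Dict[str, str], terms: List[dict]) -> bool:
    """True if the labels satisfy at least one term (only 'In' is handled)."""
    for term in terms:
        exprs = term.get("matchExpressions", [])
        if exprs and all(
            e["operator"] == "In" and labels.get(e["key"]) in e["values"]
            for e in exprs
        ):
            return True
    return False


def pick_machineset_to_scale(machinesets: List[dict],
                             terms: List[dict]) -> Optional[str]:
    """Return the first machineset that could host the pod and may still grow."""
    for ms in machinesets:
        has_headroom = ms["replicas"] < ms["max_replicas"]
        if has_headroom and labels_satisfy(ms["node_labels"], terms):
            return ms["name"]
    return None


# Hypothetical required affinity of the pending pod.
REQUIRED_TERMS = [{"matchExpressions": [
    {"key": "example.com/capacity-type", "operator": "In", "values": ["spot"]},
]}]

# Hypothetical machinesets: the Spot one currently has zero replicas.
MACHINESETS = [
    {"name": "regular-workers", "replicas": 3, "max_replicas": 6,
     "node_labels": {"example.com/capacity-type": "regular"}},
    {"name": "spot-workers", "replicas": 0, "max_replicas": 4,
     "node_labels": {"example.com/capacity-type": "spot"}},
]

if __name__ == "__main__":
    print(pick_machineset_to_scale(MACHINESETS, REQUIRED_TERMS))
```

In this example the Spot machineset is selected even though it has no running nodes, which is the scale-from-zero case this request is about.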

3. Why does the customer need this?

We want to encourage developers to put interruptible workloads onto Spot instances in order to reduce cloud compute costs.

Spot instances, by their very nature, can be removed from the cluster at any moment, so we cannot keep one running in the cluster at all times, even if we wanted to. On GCP, a Spot instance will certainly be removed at least once every 24 hours.

We would like the cluster to react automatically to the needs of the workloads, without any intervention from the OpenShift operations team.

4. List any affected packages or components.

Assignee: Subin M
Reporter: Archisman Dey
Votes: 0
Watchers: 2
Created: 2024/08/12 1:28 PM
Updated: 2024/11/12 2:17 PM
