-
Bug
-
Resolution: Not a Bug
-
Major
-
None
-
4.11
The OpenShift autoscaler does not trigger a scale-up for a MachineAutoscaler with "minReplicas: 0" for Pods that define ephemeral-storage requests.
So given the following MachineAutoscaler / MachineSet combination:
```
apiVersion: "autoscaling.openshift.io/v1beta1"
kind: "MachineAutoscaler"
metadata:
  name: sbb-worker-highmem-3-0
  namespace: "openshift-machine-api"
spec:
  minReplicas: 0
  maxReplicas: 3
  scaleTargetRef:
    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    name: sbb-worker-highmem-3-0
```
```
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: eap01p-2jsxv
  name: sbb-worker-highmem-3-0
  namespace: openshift-machine-api
spec:
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: eap01p-2jsxv
      machine.openshift.io/cluster-api-machineset: sbb-worker-highmem-3-0
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: eap01p-2jsxv
        machine.openshift.io/cluster-api-machine-role: highmem
        machine.openshift.io/cluster-api-machine-type: highmem
        machine.openshift.io/cluster-api-machineset: sbb-worker-highmem-3-0
    spec:
      metadata:
        labels:
          node-role.kubernetes.io/app: ""
          node-role.kubernetes.io/highmem: ""
      providerSpec:
        value:
          apiVersion: machine.openshift.io/v1beta1
          credentialsSecret:
            name: azure-cloud-credentials
            namespace: openshift-machine-api
          image:
            offer: ""
            publisher: ""
            resourceID: /resourceGroups/eap01p-2jsxv-rg/providers/Microsoft.Compute/images/eap01p-2jsxv
            sku: ""
            version: ""
          kind: AzureMachineProviderSpec
          location: westeurope
          managedIdentity: eap01p-2jsxv-identity
          networkResourceGroup: sbb-prod-eap-ocp-network
          osDisk:
            cachingType: ReadOnly
            diskSettings:
              ephemeralStorageLocation: Local
            diskSizeGB: 400
            managedDisk:
              storageAccountType: Standard_LRS
            osType: Linux
          publicIP: false
          publicLoadBalancer: eap01p-2jsxv
          resourceGroup: eap01p-2jsxv-rg
          securityProfile:
            encryptionAtHost: true
          subnet: sbb-prod-eap-ocp-private-sub
          userDataSecret:
            name: worker-user-data
          vmSize: Standard_E16as_v4
          vnet: sbb-prod-eap-ocp-vnet
          zone: "3"
      taints:
        - effect: NoSchedule
          key: node-role.kubernetes.io/highmem
```
Trying to start a pod that specifies ephemeral-storage requests...
```
...
resources:
  requests:
    memory: "64Mi"
    ephemeral-storage: "256Mi"
    cpu: "250m"
...
```
...fails with the following message from the autoscaler:
```
pod didn't trigger scale-up: 3 Insufficient "ephemeral-storage"
```
The problem itself is not specific to ephemeral storage: the same applies to CPU and memory when scaling from zero. For those two resources, however, the current autoscaler implementation supports the "machine.openshift.io/memoryMb" and "machine.openshift.io/vCPU" annotations, which help the autoscaler determine which MachineSet to scale. No such annotation seems to exist for ephemeral storage.
What options are there to trigger the scale-up of an autoscaled MachineSet with minReplicas 0 that currently has 0 active replicas, for a pod that specifies ephemeral-storage requests? The expected behavior would be that an annotation can be added to the MachineSet to inform the autoscaler about the amount of ephemeral storage on nodes created from the MachineSet. Always keeping a node running in the MachineSet is not an option.
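To illustrate the reasoning above, here is a minimal Python sketch of how a scale-from-zero capacity check might behave when only the "machine.openshift.io/vCPU" and "machine.openshift.io/memoryMb" annotations are available. It is a simplified model, not the actual autoscaler code: the function names `parse_quantity`, `template_capacity` and `insufficient_resources` are hypothetical, the annotation values (16 vCPU, 131072 MiB) are illustrative figures for the VM size in the MachineSet, and treating memoryMb as MiB is an assumption.

```python
import re
from fractions import Fraction

# Suffixes commonly used in Kubernetes resource quantities (subset only).
_SUFFIXES = {
    "": Fraction(1), "m": Fraction(1, 1000),
    "k": Fraction(10**3), "M": Fraction(10**6), "G": Fraction(10**9),
    "Ki": Fraction(2**10), "Mi": Fraction(2**20), "Gi": Fraction(2**30),
}

def parse_quantity(text):
    """Convert a quantity such as "250m" or "256Mi" into base units."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([A-Za-z]*)", text.strip())
    if match is None or match.group(2) not in _SUFFIXES:
        raise ValueError(f"unsupported quantity: {text!r}")
    return Fraction(match.group(1)) * _SUFFIXES[match.group(2)]

def template_capacity(annotations):
    """Derive the capacity assumed for an empty MachineSet (hypothetical model)."""
    capacity = {}
    if "machine.openshift.io/vCPU" in annotations:
        capacity["cpu"] = Fraction(annotations["machine.openshift.io/vCPU"])
    if "machine.openshift.io/memoryMb" in annotations:
        # Assumption: the annotation value is interpreted as MiB.
        capacity["memory"] = Fraction(annotations["machine.openshift.io/memoryMb"]) * 2**20
    return capacity

def insufficient_resources(requests, capacity):
    """Return the sorted names of requested resources the template cannot satisfy."""
    return sorted(
        name for name, amount in requests.items()
        if parse_quantity(amount) > capacity.get(name, 0)
    )

# Illustrative annotation values; the pod requests are taken from the report.
annotations = {
    "machine.openshift.io/vCPU": "16",
    "machine.openshift.io/memoryMb": "131072",
}
requests = {"memory": "64Mi", "ephemeral-storage": "256Mi", "cpu": "250m"}
print(insufficient_resources(requests, template_capacity(annotations)))
# ['ephemeral-storage']
```

In this model, CPU and memory are satisfied by the annotated capacity, while any non-zero ephemeral-storage request is reported as insufficient because nothing supplies a value for it. This matches the shape of the message quoted above.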
- relates to
-
OCPBUGS-13541 pod with GPU request, volume assigned and nodeSelector applied is failing to trigger OpenShift Container Platform 4 - Node scale-up
- Closed
-
OCPBUGS-14074 Pods with GPU request are failing to get scheduled since autoscaler is not scaling up the respective nodes
- Closed