Type: Bug
Resolution: Done
Priority: Blocker
Fix Version/s: DWO 0.19.0, 3.5.0.GA
Affects Version/s: 3.4.0.GA
Component/s: Team B: DevWorkspace + Operator, Web Terminal + Operator, editors/IDEs + built-in vscode extensions, Universal Developer Image, machine-exec, dev environment config
Labels: None
Blocked: False
Blocked Reason: None
Ready: False

Description of problem:

During performance testing we noticed that, even though our cluster had a MachineAutoscaler configured, the autoscaler never spun up new Machines on demand.

This is because whenever we filled the cluster capacity (about 170 Workspace instances in our case), the next Workspace failed to start for lack of capacity, and the Dev Spaces operator immediately marked it as failed and stopped it. Part of the "stop" routine is to scale the Deployment from 1 replica to 0. As a result, the Workspace pod never stays in the Pending phase, which the autoscaler requires in order to trigger a scale-up.
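To make the distinction concrete, here is a minimal sketch of how an unschedulable pod could be told apart from a genuinely failed one. It is not the operator's actual code: the function names `is_unschedulable` and `should_stop_workspace` are hypothetical, and the input is assumed to be a plain dict shaped like the `status` section of a Kubernetes Pod (a `phase` plus a list of `conditions`).

```python
def is_unschedulable(pod_status: dict) -> bool:
    """Return True if the pod is Pending because no node can fit it.

    Kubernetes reports this as phase "Pending" together with a
    PodScheduled condition whose status is "False" and whose reason
    is "Unschedulable".
    """
    if pod_status.get("phase") != "Pending":
        return False
    for cond in pod_status.get("conditions", []):
        if (
            cond.get("type") == "PodScheduled"
            and cond.get("status") == "False"
            and cond.get("reason") == "Unschedulable"
        ):
            return True
    return False


def should_stop_workspace(pod_status: dict) -> bool:
    """Hypothetical policy: stop (scale to 0) only on a real failure.

    An unschedulable pod is left in Pending so that an autoscaler
    can observe it and add capacity.
    """
    if is_unschedulable(pod_status):
        return False
    return pod_status.get("phase") == "Failed"


if __name__ == "__main__":
    pending = {
        "phase": "Pending",
        "conditions": [
            {"type": "PodScheduled", "status": "False", "reason": "Unschedulable"}
        ],
    }
    print(should_stop_workspace(pending))
```

Under this assumed policy, scaling the Deployment to 0 is reserved for pods that have actually failed, while a pod that merely cannot be scheduled stays visible to the autoscaler.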

Prerequisites (if any, like setup, operators/versions):

MachineAutoscaler configured on your cluster
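For readers unfamiliar with this prerequisite, the sketch below builds a MachineAutoscaler manifest as a Python dict and prints it as JSON. The resource name, the MachineSet name, and the replica bounds are placeholders, not values from this report; the helper `machine_autoscaler` is hypothetical and exists only for illustration.

```python
import json


def machine_autoscaler(name: str, machineset: str,
                       min_replicas: int, max_replicas: int) -> dict:
    """Build a MachineAutoscaler manifest targeting one MachineSet."""
    return {
        "apiVersion": "autoscaling.openshift.io/v1beta1",
        "kind": "MachineAutoscaler",
        "metadata": {"name": name, "namespace": "openshift-machine-api"},
        "spec": {
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "scaleTargetRef": {
                "apiVersion": "machine.openshift.io/v1beta1",
                "kind": "MachineSet",
                "name": machineset,
            },
        },
    }


if __name__ == "__main__":
    # Placeholder names and bounds; substitute the values for your cluster.
    manifest = machine_autoscaler("worker-autoscaler", "example-worker-machineset", 1, 6)
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be piped to `oc apply -f -`, assuming the named MachineSet exists on the cluster.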

Steps to Reproduce:

Create as many (N) workspaces as required to exhaust the cluster capacity.

Actual results:

When the Nth workspace fails to start (because the cluster is at full capacity), the operator stops the Deployment by scaling it from 1 replica to 0, so the autoscaler never kicks in.

Expected results:

When the Nth workspace fails to start (because the cluster is at full capacity), the operator keeps the Workspace pod in the Pending phase, which allows the autoscaler to kick in.
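The difference between the actual and expected results can be illustrated with a toy simulation. This is a deliberately simplified model, not a description of the real scheduler or autoscaler: every workspace is assumed to need the same resources, and `ToyCluster`, `simulate`, and the `keep_pending` flag are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ToyCluster:
    nodes: int           # current number of nodes
    max_nodes: int       # upper bound the toy autoscaler may reach
    pods_per_node: int   # simplified capacity: identical pods per node
    running: int = 0
    pending: int = 0

    def capacity(self) -> int:
        return self.nodes * self.pods_per_node

    def start_workspace(self, keep_pending: bool) -> str:
        if self.running < self.capacity():
            self.running += 1
            return "Running"
        if keep_pending:
            self.pending += 1   # pod stays Pending, visible to the autoscaler
            return "Pending"
        return "Stopped"        # replicas scaled 1 -> 0, no pod is left behind

    def autoscale(self) -> None:
        # Toy autoscaler: adds nodes only while Pending pods exist.
        while self.pending > 0 and self.nodes < self.max_nodes:
            self.nodes += 1
            schedulable = min(self.pending, self.capacity() - self.running)
            self.running += schedulable
            self.pending -= schedulable


def simulate(keep_pending: bool, workspaces: int = 3) -> tuple:
    cluster = ToyCluster(nodes=1, max_nodes=2, pods_per_node=2)
    for _ in range(workspaces):
        cluster.start_workspace(keep_pending)
    cluster.autoscale()
    return cluster.nodes, cluster.running


if __name__ == "__main__":
    print("stop on failure:", simulate(keep_pending=False))
    print("keep pending:   ", simulate(keep_pending=True))
```

In this model the cluster only grows when the extra workspace is left Pending; when it is stopped instead, the autoscaler has nothing to react to.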

Reproducibility (Always/Intermittent/Only Once):

Always. As soon as the cluster capacity is exhausted, we can no longer create any Workspace pods, even though an autoscaler is configured for the cluster.

Build Details:

Dev Spaces 3.4

Additional info (Such as Logs, Screenshots, etc):

We see this error in the UI:

This matches the following OCP events (note the scale-down to 0, which prevents the autoscaler from working):

We also tried the following workaround, with no success:

Attachments:

error.png (107 kB, 2023/02/27 2:58 PM)
devworkspace-config.png (70 kB, 2023/02/27 2:58 PM)
unschedulable.png (146 kB, 2023/02/27 2:58 PM)
image-2023-03-14-15-09-06-804.png (444 kB, 2023/03/14 2:09 PM)
pending.png (86 kB, 2023/03/14 2:09 PM)
autoscaler.png (99 kB, 2023/03/14 2:10 PM)

Assignee: Mario Loriedo
Reporter: Anton Giertli
Votes: 0
Watchers: 5
Created: 2023/02/27 2:57 PM
Updated: 2023/03/21 4:22 PM
Resolved: 2023/03/01 3:28 PM
