- Bug
- Resolution: Unresolved
A rolling update works as expected with the template available in the docs. However, when the Kueue queue-name label is added, the replica pods get stuck in a Pending state and the update is never applied to the LWS pods.
Steps to reproduce:
- Install Kueue (Operator and Operand) and LWS (Operator and Operand)
- Add LeaderWorkerSet to the Kueue Operand CR (make sure Pods, Deployments and StatefulSets are also included)
- Create a Resource Flavor, Cluster Queue, Namespace and Local Queue
- Apply an LWS template with a rollout strategy. Example:
  ...
    labels:
      kueue.x-k8s.io/queue-name: user-queue
  spec:
    rolloutStrategy:
      type: RollingUpdate
      rollingUpdateConfiguration:
        maxUnavailable: 2
        maxSurge: 2
  ...
- Wait for pods to be created successfully
- Change something in the template so that an update is triggered. For example, in my template I changed the time from 3600 to another value (the template is attached to this bug)
- Reapply the template and wait for the update to happen
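For reference, the queue setup from the steps above could look roughly like the sketch below. This is not the exact configuration used when the bug was found: the names default-flavor, cluster-queue and lws-test, as well as the quota values, are assumptions made for illustration. Only the Local Queue name user-queue comes from the label in the template.

```yaml
# Hypothetical Kueue objects for reproducing the issue.
# All names except "user-queue" and all quota values are assumptions.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: default-flavor
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: cluster-queue
spec:
  namespaceSelector: {}          # admit workloads from any namespace
  resourceGroups:
  - coveredResources: ["cpu", "memory"]
    flavors:
    - name: default-flavor
      resources:
      - name: cpu
        nominalQuota: 8          # assumed value
      - name: memory
        nominalQuota: 16Gi       # assumed value
---
apiVersion: v1
kind: Namespace
metadata:
  name: lws-test
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  namespace: lws-test
  name: user-queue               # matches kueue.x-k8s.io/queue-name in the LWS template
spec:
  clusterQueue: cluster-queue
```

The LWS template would then be applied in the same namespace as the Local Queue, since the queue-name label is resolved against Local Queues in the workload's namespace.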
Actual: Replica pods get stuck in a Pending state and the update is not completed.
Expected: The update should complete successfully. Replica pods should be created to assist the update and deleted once it has finished.
PS: Without the Kueue label, the update runs successfully.
- relates to: OCPKUEUE-363 Feature Verification for LeaderWorkerSet (In Progress)