Bug
Resolution: Unresolved
Normal
4.20
OCP Node Kueue Sprint 283
A LeaderWorkerSet (LWS) rolling update works fine with the template available in the docs. However, when the Kueue queue-name label is added, the replica pods get stuck in a Pending state and the update is never applied to the LWS pods.
Steps to reproduce:
- Install Kueue (Operator and Operand) and LWS (Operator and Operand)
- Add LeaderWorkerSet to the Kueue Operand CR (make sure Pods, Deployments and StatefulSets are also enabled)
- Create a Resource Flavor, Cluster Queue, Namespace and Local Queue
- Apply an LWS template with a rollout strategy. For example:
    ...
      labels:
        kueue.x-k8s.io/queue-name: user-queue
    spec:
      rolloutStrategy:
        type: RollingUpdate
        rollingUpdateConfiguration:
          maxUnavailable: 2
          maxSurge: 2
    ...
- Wait for pods to be created successfully
- Change something in the template so that an update is triggered. For example, I changed the sleep time in my template from 3600 to 2600 (the template is attached to this bug)
- Reapply the template and wait for the update to happen
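For readers who want to recreate the queue setup from the third step, the following is a minimal sketch of the Kueue objects involved. Only the queue name `user-queue` comes from this report. The names `default-flavor`, `cluster-queue` and `lws-test`, and the quota values, are assumptions chosen for illustration and are not taken from the attached template.

```yaml
# Hypothetical Kueue setup for reproducing this bug.
# Only "user-queue" appears in the report; all other names and quotas are assumed.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: default-flavor          # assumed name
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: cluster-queue           # assumed name
spec:
  namespaceSelector: {}         # empty selector: accept workloads from any namespace
  resourceGroups:
  - coveredResources: ["cpu", "memory"]
    flavors:
    - name: default-flavor
      resources:
      - name: cpu
        nominalQuota: 8         # assumed quota
      - name: memory
        nominalQuota: 16Gi      # assumed quota
---
apiVersion: v1
kind: Namespace
metadata:
  name: lws-test                # assumed namespace
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: user-queue              # matches the queue-name label used in the LWS template
  namespace: lws-test
spec:
  clusterQueue: cluster-queue
```

The LocalQueue has to live in the same namespace as the LeaderWorkerSet, because the `kueue.x-k8s.io/queue-name` label is resolved against LocalQueues in the workload's own namespace.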
Actual: Replica pods get stuck in a Pending state and the update never completes.
Expected: The update should complete successfully. Additional replica pods should be created to support the update and should be deleted once it has finished.
Note: Without the Kueue label, the update runs successfully.
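To make the difference between the failing and working cases concrete, here is a sketch of what a complete LWS manifest might look like. It is a stand-in for the attached template, not a copy of it. The `labels` and `rolloutStrategy` fields are taken from the fragment in the steps above. The object name, namespace, replica count, group size, image and resource requests are all assumptions.

```yaml
# Hypothetical stand-in for the attached template.
# labels and rolloutStrategy are from the report; everything else is assumed.
apiVersion: leaderworkerset.x-k8s.io/v1
kind: LeaderWorkerSet
metadata:
  name: lws-rollout-test        # assumed name
  namespace: lws-test           # assumed namespace containing the LocalQueue
  labels:
    kueue.x-k8s.io/queue-name: user-queue   # removing this line gives the working case
spec:
  replicas: 2                   # assumed
  rolloutStrategy:
    type: RollingUpdate
    rollingUpdateConfiguration:
      maxUnavailable: 2
      maxSurge: 2
  leaderWorkerTemplate:
    size: 2                     # assumed: one leader plus one worker per group
    workerTemplate:
      spec:
        containers:
        - name: sleeper         # assumed container name
          image: registry.access.redhat.com/ubi9/ubi-minimal   # assumed image
          command: ["sleep", "3600"]   # change to "2600" to trigger the rolling update
          resources:
            requests:           # assumed requests so Kueue can account for quota
              cpu: 100m
              memory: 64Mi
```

Applying this manifest, editing the sleep value and reapplying corresponds to the last three reproduction steps. Repeating the same sequence without the `kueue.x-k8s.io/queue-name` label corresponds to the case reported as working.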
- is related to
  - OCPKUEUE-363 Feature Verification for LeaderWorkerSet (Closed)
  - OCPKUEUE-514 [Release] - Bug verification (To Do)
  - OCPKUEUE-487 LWS Rollout Update - Update does not happen on LWS pods (Closed)
- links to