Bug
Resolution: Unresolved
rhos-17.1.4
Low
To Reproduce
We did not reproduce the issue. Based on our understanding and the event list, the sequence was:
- The customer stopped the instances on a compute node that had a hardware issue (the compute service was down) before trying to evacuate them
- They then evacuated the instances, which came up fine on the new destination compute
- When they fixed the hardware issues on the source compute and brought it back online, the evacuated instances were stopped
- The customer manually started the instances successfully
- We suspect that the stop action from the first step somehow got queued while the source compute was down, and triggered unwanted actions when it came back up.
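One way to check this suspicion against the event list is to look for stop actions recorded after the evacuation. The sketch below is only an illustration: it assumes the events were exported as JSON (for example with `openstack server event list <server> -f json`) and that each record has `Action`, `Start Time`, and `Request ID` keys. The helper name `stops_after_evacuation` and the sample records are hypothetical and are not taken from the customer environment.

```python
from datetime import datetime


def stops_after_evacuation(events):
    """Return 'stop' actions that started after the latest 'evacuate' action.

    `events` is a list of dicts with 'Action' and 'Start Time' keys
    (assumed shape of the exported instance event list).
    """
    def started(event):
        return datetime.fromisoformat(event["Start Time"])

    evac_times = [started(e) for e in events if e["Action"] == "evacuate"]
    if not evac_times:
        # No evacuation recorded, so there is nothing to compare against.
        return []
    last_evac = max(evac_times)
    late_stops = [
        e for e in events
        if e["Action"] == "stop" and started(e) > last_evac
    ]
    return sorted(late_stops, key=started)


# Hypothetical sample records, for illustration only.
sample_events = [
    {"Request ID": "req-1", "Action": "stop", "Start Time": "2024-01-01T10:00:00"},
    {"Request ID": "req-2", "Action": "evacuate", "Start Time": "2024-01-01T10:30:00"},
    {"Request ID": "req-3", "Action": "stop", "Start Time": "2024-01-02T09:00:00"},
    {"Request ID": "req-4", "Action": "start", "Start Time": "2024-01-02T09:15:00"},
]

suspicious = stops_after_evacuation(sample_events)
for event in suspicious:
    print(event["Request ID"], event["Start Time"])
```

If a stop action shows up after the evacuation, comparing its start time with the time the source compute service came back online would help confirm or rule out the queued-stop theory.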
Expected behavior
- After an evacuation, instances running on the destination compute should not be impacted when the source compute comes back online
Bug impact
- Tenant workload went down unexpectedly
Known workaround
- Rebooting the instances brings the workload back up, but there is no known workaround to prevent them from stopping until we know why they stopped
Additional context
- If the stop action issued by the customer before the evacuation is responsible for this issue, I think it is a bug.
- I will update with more information.