Type: Epic
Resolution: Unresolved
Priority: Normal
Fix Version/s: rhos-18 Feature Release 2
Affects Version/s: None
Component/s: None
Labels:
None

Epic Name:
[RFE] configurable scheduling back to a compute node after instance ha fences it
Blocked:
False
Blocked Reason:
None
Ready:
False
Parent Link:
OSPRH-3351 Investigate InstanceHA in nextgen
Dev Approval:
Proposed
Docs Approval:
?
Epic Status:
To Do
Feature Link:
OSPRH-3351 - Investigate InstanceHA in nextgen
PM Approval:
?
QE Approval:
?
Intelligence Requested:
Market:

SFDC Cases Links:
SFDC Cases Counter:
SFDC Cases Open:

Description of problem:

The customer does not want Instance HA to start nova-compute automatically after a compute node comes back up from fencing. Most of the failures they see involve memory going bad. If a DIMM fails and the compute node comes back up without it, the node has less RAM and is still not ready for use. If VMs are scheduled there again after fencing, the admin still has to manually disable the compute service and migrate the VMs off in order to fix the hardware. Is there a way to have the admin confirm that the compute node is healthy before scheduling to that node resumes? We tried disabling the compute unfence trigger, since the docs say that is what unfences the node when it comes back up, but that did not work. Manually disabling the compute service does not seem like a good option either, since the admin may not know exactly when fencing happens.
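
The requested behaviour can be illustrated with a minimal sketch. This is only a model of the policy, not Instance HA or Nova code: the names `ComputeNode`, `on_fenced`, `on_recovered`, `admin_confirm` and the `require_admin_confirmation` option are hypothetical and introduced here for illustration. It assumes the node stays unschedulable after fencing until an admin confirms it, and that a node reporting less RAM than expected is never re-enabled automatically.

```python
from dataclasses import dataclass
from enum import Enum


class NodeState(Enum):
    ENABLED = "enabled"
    FENCED = "fenced"
    AWAITING_CONFIRMATION = "awaiting_confirmation"


@dataclass
class ComputeNode:
    # Hypothetical model of a compute node, for illustration only.
    name: str
    expected_ram_mb: int
    state: NodeState = NodeState.ENABLED

    def schedulable(self) -> bool:
        # Only an enabled node may receive new VMs.
        return self.state is NodeState.ENABLED


def on_fenced(node: ComputeNode) -> None:
    # Fencing always takes the node out of scheduling.
    node.state = NodeState.FENCED


def on_recovered(node: ComputeNode, reported_ram_mb: int,
                 require_admin_confirmation: bool = True) -> bool:
    """Handle a node coming back up; return True if it was re-enabled."""
    ram_ok = reported_ram_mb >= node.expected_ram_mb
    # Hold the node back if the policy asks for confirmation, or if it
    # came back with less RAM than expected (for example a missing DIMM).
    if require_admin_confirmation or not ram_ok:
        node.state = NodeState.AWAITING_CONFIRMATION
        return False
    node.state = NodeState.ENABLED
    return True


def admin_confirm(node: ComputeNode) -> None:
    # Explicit admin action that allows scheduling to resume.
    if node.state is not NodeState.AWAITING_CONFIRMATION:
        raise ValueError(f"{node.name} is not awaiting confirmation")
    node.state = NodeState.ENABLED
```

With `require_admin_confirmation=False` the sketch falls back to automatic re-enablement, but only when the reported RAM matches the expected value; whether such a check belongs in Instance HA itself is an open question for this RFE.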

https://bugzilla.redhat.com/show_bug.cgi?id=2120768

Assignee: Luca Miccini

Reporter: Luca Miccini

Team: rhos-dfg-pidone

Votes: 0

Watchers: 1

Created: 2024/01/05 3:00 PM

Updated: 2024/06/21 4:52 AM
