- Epic
- Resolution: Done
- Critical
- None
- CNV Infra 243, CNV Infra Next
Goal
- Internally document how long it takes a VM to recover if the node dies.
- Document in a knowledge base article which configurations are possible and tested.
- Find opportunities and configuration changes to optimize eviction from the node and the recovery time.
User Stories
- As a VM owner, I want my VM to recover as quickly as possible if the node running it dies, so that my application has only minimal downtime, even if the node reboots frequently.
Non-Requirements
- We will start with NHC; MHC is a possible follow-up.
Notes
- Maybe the SAP cluster could be used?
- Geetika, is the cluster good enough for the performance questions?
- 6-node or 3-node cluster?
- Ronen
- On which remediator should the scenario be focused? SNR, FAR or Metal3?
- Ronen
- Help from the virt team might be required to tune the VM.
- Help from the NHC side might be required for the tuning.
- There is a matrix of combinations that influence the recovery time:
- Remediators (SNR + FAR)
- Health Check
- Cluster size (3 nodes, 6 nodes)
-> We have to start with one combination and can extend to other scenarios later.
- Start with CNV 4.13; the article might refer to CNV 4.14.
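The combination matrix in the notes above can be enumerated with a short script, which may help when picking the first scenario and tracking which ones are still untested. This is only a sketch: the health check values (NHC, MHC) are an assumption taken from the Non-Requirements section, Metal3 is left out because it is only raised as an open question, and the names `build_matrix`, `REMEDIATORS`, `HEALTH_CHECKS` and `CLUSTER_SIZES` are hypothetical.

```python
from itertools import product

# Dimensions taken from the notes; HEALTH_CHECKS is an assumption
# (NHC first, MHC as a possible follow-up).
REMEDIATORS = ["SNR", "FAR"]
HEALTH_CHECKS = ["NHC", "MHC"]
CLUSTER_SIZES = [3, 6]  # number of nodes


def build_matrix():
    """Return every remediator / health check / cluster size combination."""
    return [
        {"remediator": r, "health_check": h, "nodes": n}
        for r, h, n in product(REMEDIATORS, HEALTH_CHECKS, CLUSTER_SIZES)
    ]


if __name__ == "__main__":
    for combo in build_matrix():
        print(combo)
```

With two values per dimension this yields eight scenarios, so starting with a single combination (for example NHC on one cluster size with one remediator) keeps the initial measurement effort small.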
- is cloned by
- CNV-39369 Update Knowledge base article an VM recovery time on CNV 4.16 (Closed)
- is depended on by
- CNV-36134 Reduce time to redeploy VM scheduled on unhealthy node on 4.15.1 (Closed)
- relates to
- CNV-31893 BMC Event driven fencing (In Progress)
- CNV-25645 Target additional testing for NHC and SNR with compact clusters (Closed)