Feature Request
Resolution: Unresolved
CNV v4.10.0
Feature Overview
- The customer sees long VM restart times of about 15 minutes when a host fails.
Goals
- When a host fails, the VM should restart on another node within seconds, so that applications and servers see minimal downtime.
Requirements
A list of specific needs or objectives that the Feature must deliver. Some requirements will be flagged as MVP. If an MVP requirement slips, the feature shifts. If a non-MVP requirement slips, the feature does not shift.
- The customer is requesting answers to the questions listed under Questions to answer.
Requirement | Notes | isMvp? |
---|---|---|
CI - MUST be running successfully with test automation | This is a requirement for ALL features. | YES |
Release Technical Enablement | Provide necessary release enablement details and documents. | YES |
Questions to answer
- What system checks or health checks are performed on an IPI installation with the MachineHealthCheck controller?
- With the MachineHealthCheck controller, how long, or within what range of time, does a VM take to restart after a host failure?
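To help frame the timing question, here is a minimal Python sketch that reads the `unhealthyConditions` timeouts from a MachineHealthCheck-shaped dictionary and reports the earliest point at which the controller could flag a machine as unhealthy. The resource name, the timeout values, and the helper names (`parse_duration`, `earliest_detection_seconds`) are illustrative assumptions, not values from the customer's cluster. The result is only one component of the total restart time: it does not cover how long the node takes to report the condition, machine remediation, or VM rescheduling.

```python
import re

# Seconds per supported duration unit (only simple "<int><unit>" forms).
_UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text: str) -> int:
    """Convert a simple duration such as "300s" or "10m" to seconds."""
    match = re.fullmatch(r"(\d+)([smh])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return int(value) * _UNITS[unit]


def earliest_detection_seconds(mhc: dict) -> int:
    """Return the shortest unhealthyConditions timeout, in seconds.

    A machine is treated as unhealthy once any listed condition has held
    for its timeout, so the smallest timeout is the earliest trigger.
    """
    conditions = mhc["spec"]["unhealthyConditions"]
    return min(parse_duration(c["timeout"]) for c in conditions)


# Hypothetical manifest content; values are placeholders for illustration.
EXAMPLE_MHC = {
    "kind": "MachineHealthCheck",
    "metadata": {"name": "example-mhc"},
    "spec": {
        "unhealthyConditions": [
            {"type": "Ready", "status": "False", "timeout": "300s"},
            {"type": "Ready", "status": "Unknown", "timeout": "10m"},
        ],
    },
}

if __name__ == "__main__":
    print(earliest_detection_seconds(EXAMPLE_MHC))
```

Shorter timeouts reduce this detection component but may increase the chance of remediating a node that is only briefly unreachable, which is the kind of trade-off the requested documentation would need to explain.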
Background, and strategic fit
This Section: What do the people writing code, testing, and documenting need to know? What context can be provided to frame this feature?
Documentation Considerations
Questions to be addressed:
- What educational or reference material (docs) is required to support this product feature? For users/admins? Other functions (security officers, etc)?
- The customer needs reference material that shows which health checks are performed when a host fails and which best practices to follow to minimize downtime.
- Does this feature have doc impact?
- Yes
- What concepts do customers need to understand to be successful in [action]?
- The customer needs answers to the items listed under Questions to answer.
- How do we expect customers will use the feature? For what purpose(s)?
- They will use this feature to minimize VM downtime caused by a host failure.
- What reference material might a customer want/need to complete [action]?
- Documentation that lists the steps to remediate a host failure and states how long the VM may take to restart on another node.
- Is there source material that can be used as reference for the Technical Writer in writing the content? If yes, please link if available.
- N/A.
- What is the doc impact (New Content, Updates to existing content, or Release Note)?
- Release Note