Type: Epic
Resolution: Won't Do
Priority: Normal
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
- platform-devprod

Epic Name:
ephemeral-cluster-outage-mitigation-jan24
Blocked:
False
Blocked Reason:
None
Ready:
False
BZ requires_doc_text:
Unset
Epic Status:
To Do
Hierarchy Progress Bar:

0% To Do, 0% In Progress, 100% Done
BZ Keywords:
- Unset

Summary and goal

A failed automated OpenShift upgrade rendered the Ephemeral cluster unusable. The Devprod team needs to work together to return it to service.

Acceptance Criteria

Hot-swap to CRCD to restore service to our users
Create new Ephemeral Cluster
Ensure we have solid documentation for creating a new Ephemeral cluster
Work with OSD to attempt to return the old Ephemeral cluster to service
Document RCA and Post Mortem after all technical work is complete

Open Questions

Why did this happen?
How can we prevent it in the future?
What larger implications does this event have for business continuity and disaster recovery?
Do we have the processes and resources in place to deal with situations like this in the future?

There are no Sub-Tasks for this issue.

Assignee: Unassigned

Reporter: Adam Drew

Votes: 0

Watchers: 2

Created: 2024/01/05 2:23 PM

Updated: 2024/12/10 6:30 PM

Resolved: 2024/12/10 6:30 PM