-
Epic
-
Resolution: Done
-
Critical
-
None
-
Loki - Cluster Restart Hardening
-
3
-
False
-
None
-
False
-
Green
-
NEW
-
Done
-
OBSDA-309 - PodDisruptionBudget for LokiStack to keep the stack up and running and under control during OpenShift Container Platform 4 - upgrades
-
VERIFIED
-
0% To Do, 0% In Progress, 100% Done
-
With this update, the Loki Operator introduces PodDisruptionBudget configuration on LokiStack deployments to ensure normal operations during OCP cluster restarts by keeping ingestion and the query path available.
-
Enhancement
Goals
- Use available pod disruption primitives to harden LokiStack reliability during OCP cluster restarts
- Keep the LokiStack ingestion path working while the OCP cluster is restarting
- Keep the LokiStack query path working while the OCP cluster is restarting
Non-Goals
- Enable user-customizable disruption configuration for each individual LokiStack component.
- Dynamically adjust the pod disruption configuration by operator-managed automation.
Motivation
In OpenShift Container Platform 4, updates are applied at the MachineConfigPool level, so customers need PodDisruptionBudgets in place to prevent undesired disruption while OpenShift Container Platform 4 nodes are being updated or rebooted.
LokiStack is missing PodDisruptionBudget configuration. As a result, all OpenShift Container Platform 4 nodes hosting LokiStack components can be updated at the same time, which restarts the entire service at once and may introduce undesired service disruption.
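To illustrate the kind of object this epic is about, here is a minimal sketch that builds a `policy/v1` PodDisruptionBudget manifest for one LokiStack component and prints it as JSON. The helper `build_pdb`, the stack name `logging-loki`, the namespace, the label keys, and the `minAvailable` value of 1 are all assumptions made for illustration; they are not taken from this ticket or from the Loki Operator source.

```python
import json


def build_pdb(stack_name: str, component: str, namespace: str,
              min_available: int = 1) -> dict:
    """Return a policy/v1 PodDisruptionBudget manifest as a plain dict.

    The label keys below are assumed for illustration; check the labels
    actually set on the LokiStack pods before relying on them.
    """
    labels = {
        "app.kubernetes.io/instance": stack_name,   # assumed label key
        "app.kubernetes.io/component": component,   # assumed label key
    }
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {
            "name": f"{stack_name}-{component}",    # hypothetical naming scheme
            "namespace": namespace,
            "labels": labels,
        },
        "spec": {
            # Keep at least this many matching pods running during
            # voluntary disruptions such as node drains.
            "minAvailable": min_available,
            "selector": {"matchLabels": labels},
        },
    }


if __name__ == "__main__":
    # Hypothetical stack name and namespace.
    manifest = build_pdb("logging-loki", "ingester", "openshift-logging")
    print(json.dumps(manifest, indent=2))
```

One budget per component on the ingestion and query paths would follow the same shape, differing only in the component label and name.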
Alternatives
Acceptance Criteria
- Any LokiStack deployment size supports OCP cluster restarts without administrator intervention.
- Any LokiStack path (ingestion/query) keeps operating within the limits of the available node resources (CPU/memory) during OCP cluster restarts.
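The criteria above rest on how a `minAvailable` budget gates voluntary evictions. The following is a simplified sketch of that arithmetic, not the Kubernetes disruption controller itself. The function names are hypothetical, and the example values (two replicas, `minAvailable` of 1) are assumptions rather than the values chosen for LokiStack.

```python
from typing import Tuple


def disruptions_allowed(healthy_pods: int, min_available: int) -> int:
    """Voluntary evictions a minAvailable budget permits right now."""
    return max(0, healthy_pods - min_available)


def try_evict(healthy_pods: int, min_available: int) -> Tuple[bool, int]:
    """Attempt one eviction; return (granted, healthy pods afterwards)."""
    if disruptions_allowed(healthy_pods, min_available) > 0:
        return True, healthy_pods - 1
    # Budget exhausted: the eviction is refused and nothing changes.
    return False, healthy_pods


if __name__ == "__main__":
    healthy = 2  # assumed replica count for one component
    granted, healthy = try_evict(healthy, min_available=1)
    print("first eviction granted:", granted, "healthy:", healthy)
    granted, healthy = try_evict(healthy, min_available=1)
    print("second eviction granted:", granted, "healthy:", healthy)
    healthy += 1  # the replacement pod became ready on another node
    granted, healthy = try_evict(healthy, min_available=1)
    print("retry after reschedule granted:", granted, "healthy:", healthy)
```

In this model the second eviction is refused until the replacement pod is ready, which is the behaviour that keeps at least one pod per path serving while nodes are drained one after another.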
Risk and Assumptions
Documentation Considerations
PodDisruptionBudgets are already well documented in the official OpenShift Container Platform documentation. However, our Logging docs should include some sort of banner that explains how the LokiStack will behave during cluster restarts, e.g. by explaining the effect of each PodDisruptionBudget we place.
Open Questions
Additional Notes
- is cloned by
-
OBSDOCS-214 Loki - Cluster Restart Hardening
- Closed
- is documented by
-
OBSDOCS-214 Loki - Cluster Restart Hardening
- Closed