Feature Request
Resolution: Unresolved
Normal
Product / Portfolio Work
1. Proposed title of this feature request
Allow node drains to better work with IBM MQ pods
2. What is the nature and description of the request?
If IBM MQ is deployed in a Native HA setup, the pods run in an active/standby configuration with 1 active pod and 2 standby pods. A quorum (2 of the 3 pods) must be available to guarantee high availability (HA).
If all nodes have to be drained as part of a Machine Config Operator (MCO) update, a situation can occur where the nodes are drained faster than the pods' quorum can be maintained. Normally, a PodDisruptionBudget (PDB) resource can be used to halt the drain so that, if evicting a pod would break the quorum, the eviction is held back until another replica pod comes up (see the sketch after the pod listing below).
However, this does not work with IBM MQ, because only the active pod is in a Ready (1/1) state; the standby pods remain NotReady (0/1) and therefore never count as healthy toward the PDB.
NAME       READY   STATUS    RESTARTS   AGE   IP
ibm-mq-0   0/1     Running   0          12h   172.19.0.2
ibm-mq-1   1/1     Running   0          12h   172.19.0.3
ibm-mq-2   0/1     Running   0          12h   172.19.0.4
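For context, a PDB intended to protect the Native HA quorum would look roughly like the sketch below. The namespace and selector labels are assumptions for illustration, not values from the customer's environment. Because only the active pod reports Ready, the budget's healthy-pod count stays at 1 and never reflects the actual 2-of-3 quorum, which is why this approach currently fails for IBM MQ.

# Minimal sketch of a PDB for the Native HA quorum; namespace and labels are assumed.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: ibm-mq-pdb
  namespace: mq                          # assumed namespace
spec:
  minAvailable: 2                        # quorum: 2 of the 3 Native HA pods
  selector:
    matchLabels:
      app.kubernetes.io/name: ibm-mq     # assumed pod label
# Because the standby pods are NotReady, they never count as "healthy" here,
# so the budget cannot be satisfied in the way the quorum requires.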
3. Why does the customer need this? (List the business requirements here)
During node drains, the customer repeatedly experiences outages of the IBM MQ pods lasting 5-10 minutes while quorum is re-established.
4. List any affected packages or components.
Machine Config Controller, Machine Config Daemon, kubectl (drain)