The customer is requesting a feature enhancement.

The high availability provided by OpenShift does not meet our rapid-recovery requirements when a broker node fails. When a broker runs into problems (e.g. out-of-memory), the observed recovery time, measured until the Pod is killed, recreated, and reaches the “ready” state again, can be as long as 1–2 minutes. Messages stored on the failed broker's persistent volume are unavailable during this recovery time, so we cannot meet the end-to-end processing-time requirements of our time-sensitive, business-critical use cases.

Shared-storage-based HA (supported on bare metal and VMs) would provide much shorter recovery times, with client-side failover, but this master-slave HA architecture is not supported on OpenShift/ROSA according to

. We use ODF (Ceph) persistent volumes with ROSA, which is a supported storage option for shared-storage-based HA according to

The AMQ Broker Operator, or another recommended deployment process, should support the shared-storage-based high availability architecture so that quicker recovery times can be achieved on OpenShift/ROSA. A similar issue is described in

relates to

ENTMQBR-7500 Improve the availability of brokers in Openshift