-
Bug
-
Resolution: Done-Errata
-
Major
-
None
-
ACM 2.9.0
-
True
-
Data sync between the managed clusters stops.
-
False
-
The issue should not be seen on an RDR setup. Seamless Submariner connectivity should be maintained between the managed clusters, with or without node-related operations, so that data sync continues and recovery operations can be performed.
-
-
-
Submariner Sprint 2023-11, Submariner Sprint 2023-12
-
Important
-
No
Description of problem: Data sync between managed clusters stops on a Regional DR setup. The issue has been very consistent for cephfs-based workloads, but was also seen with rbd-based workloads in a few cases.
Version-Release number of selected component (if applicable):
ODF 4.14.0-132.stable
OCP 4.14.0-0.nightly-2023-09-02-132842
ACM 2.9.0-DOWNSTREAM-2023-08-24-09-30-12
subctl version: v0.16.0
ceph version 17.2.6-138.el9cp (b488c8dad42b2ecffcd96f3d76eeeecce48b8590) quincy (stable)
image: brew.registry.redhat.io/rh-osbs/iib:507564
How reproducible: Seen multiple times on different test setups
Steps to Reproduce:
1. Create a Regional DR setup and deploy cephfs and rbd based DR protected workloads.
2. Continue running IOs for 1-2 weeks and monitor data sync between the clusters (from the ODF DR perspective).
3. Perform failover/relocate and perform the same checks.
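When monitoring sync between the clusters, Submariner connectivity can be checked directly from each managed cluster with subctl. A minimal sketch (kubeconfig paths are placeholders; assumes subctl matching the deployed Submariner version is on the PATH):

```shell
#!/bin/sh
# Hypothetical diagnostic pass over both managed clusters.
# The kubeconfig paths below are placeholders for this setup.
check_submariner() {
  kubeconfig="$1"
  echo "=== checking cluster via $kubeconfig ==="
  # Gateway-to-gateway tunnel status; each peer should show "connected"
  subctl show connections --kubeconfig "$kubeconfig"
  # Broader health check: gateway pods, routing, firewall, etc.
  subctl diagnose all --kubeconfig "$kubeconfig"
}

if command -v subctl >/dev/null 2>&1; then
  check_submariner /path/to/cluster1.kubeconfig
  check_submariner /path/to/cluster2.kubeconfig
else
  echo "subctl not found; skipping connectivity checks"
fi
```

If `subctl show connections` reports an error state while tunnels are being set up, the gateway pod logs on the affected cluster should contain the corresponding IPsec error.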
Actual results: IPsec whack returns exit status 33 while setting up tunnels, and data sync between the managed clusters stops.
Expected results: The issue should not be seen, and seamless Submariner connectivity should be maintained between the managed clusters with or without node-related operations.
Additional info:
The issue is being discussed here: https://redhat-internal.slack.com/archives/C0134E73VH6/p1695210573900509
- links to
-
RHEA-2023:114022 RHEA: Submariner 0.16.0 - bug fix and enhancement update