Type: Bug
Resolution: Not a Bug
Priority: Undefined
Fix Version/s: None
Affects Version/s: None
Component/s: Multicluster Networking
Labels:
- RDR-Blocker

Blocked:
True
Blocked Reason:
Data sync is impacted, no known workaround at this point of time.
Ready:
False
Intelligence Requested:
Market:

Severity:
Important

Regression:
No

SFDC Cases Links:
SFDC Cases Counter:
SFDC Cases Open:

Description of problem:

Version-Release number of selected component (if applicable):

OCP 4.15.0-0.nightly-2024-03-05-113700
ACM 2.10.0-DOWNSTREAM-2024-02-28-06-06-55
ODF 4.15.0-157
ceph version 17.2.6-196.el9cp (cbbf2cfb549196ca18c0c9caff9124d83ed681a4) quincy (stable)
Submariner brew.registry.redhat.io/rh-osbs/iib:680159

How reproducible:

Steps to Reproduce:

****Active hub co-situated with primary managed cluster****

1. On a Regional DR setup, perform a site failure (the active hub and the primary managed cluster go down) and move to the passive hub after hub recovery. All the CephFS workloads, of both subscription and appset types and in different states (Deployed, FailedOver, Relocated), that were running on the primary managed cluster were failed over to the failover cluster (secondary), and the failover operation was successful.

Workloads are running successfully on the failover cluster (secondary), and both VRG states are marked as Primary for all these workloads.

2. Now recover the older primary managed cluster and ensure it's successfully imported on the RHACM console (if not, create auto-import-secret for this cluster on the passive hub).
3. Monitor drpc cleanup status and lastGroupSyncTime for all the failedover workloads.
4. After successful cleanup, let IOs continue for a few days and monitor the sync progress, lastGroupSyncTime etc.
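
The monitoring in steps 3 and 4 could be scripted roughly as follows. This is a minimal sketch, not the procedure used in this report: it assumes the DRPC objects have been dumped to JSON (for example with `oc get drpc -A -o json`) and that each one exposes `status.lastGroupSyncTime` as a UTC timestamp with a trailing `Z`. The helper names and the 30-minute threshold are hypothetical; pick a threshold that matches the sync interval of the DRPolicy in use.

```python
from datetime import datetime, timedelta, timezone


def parse_k8s_time(value):
    """Parse a UTC timestamp such as 2024-03-20T07:38:00Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def stale_workloads(drpc_list, now, max_lag=timedelta(minutes=30)):
    """Return (namespace, name, lag) for DRPCs whose last sync is missing or too old.

    drpc_list is the parsed JSON list object (assumed shape: {"items": [...]}).
    max_lag is an illustrative threshold, not a value taken from the report.
    """
    stale = []
    for item in drpc_list.get("items", []):
        meta = item.get("metadata", {})
        last_sync = item.get("status", {}).get("lastGroupSyncTime")
        # A missing timestamp is reported with lag=None so it is not overlooked.
        lag = None if last_sync is None else now - parse_k8s_time(last_sync)
        if lag is None or lag > max_lag:
            stale.append((meta.get("namespace"), meta.get("name"), lag))
    return stale
```

Calling `stale_workloads(json.load(f), datetime.now(timezone.utc))` on a periodic dump would list the workloads whose sync has stopped progressing, which is the symptom described under "Actual results".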

Actual results: [RDR] [Hub recovery] [Co-situated] Data sync for all CephFS workloads is impacted while IOs are running after a successful failover and cleanup.

Expected results: Data sync should progress as expected, and no Submariner connectivity issues should be seen.

Additional info:

Slack thread: https://redhat-internal.slack.com/archives/C0134E73VH6/p1710874024678819


subctl diagnose.rtf
21 kB
2024/03/20 12:42 PM
subctl service discovery.txt
122 kB
2024/03/20 12:43 PM
subctl service discovery re-try with context.txt
7 kB
2024/03/20 3:03 PM
subctl verify-post hub recovery.txt
111 kB
2024/03/20 12:39 PM

Assignee: Thomas Pantelis

Reporter: Aman Agrawal

QA Contact: Maxim Babushkin

Votes: 0

Watchers: 2

Created: 2024/03/20 7:38 AM

Updated: 2024/03/21 11:36 AM

Resolved: 2024/03/21 6:49 AM
