  1. Red Hat Advanced Cluster Management
  2. ACM-7844

[RDR] Globalnet IP gets reallocated due to a race condition in a Globalnet- and HA-enabled environment

    • Submariner Sprint 2023-11, Submariner Sprint 2023-12, Submariner Sprint 2023-13
    • Moderate
    • No

      Description of problem:

      On an RDR longevity setup, a Globalnet IP was reallocated on one of the managed clusters (C2) because of a race condition in an environment with Globalnet and HA enabled. As a result, the Lighthouse component returned the wrong IP for a rook ns lookup query, which in turn stopped mirroring from C2 to C1.

      Version-Release number of selected component (if applicable):

      ODF - 4.14.0-128
      OCP - 4.14.0-0.nightly-2023-09-12-024050
      Submariner - 0.16 (brew.registry.redhat.io/rh-osbs/iib:569163)
      ACM - v2.9.0-109 (2.9.0-DOWNSTREAM-2023-08-24-09-30-12)
      ceph version 17.2.6-120.el9cp (6fb9bb1d83813766a53a421c7bc80f7835bcaf6c) quincy (stable)
       

      How reproducible:

      Steps to Reproduce:

      1. On a Regional DR longevity setup that has been running for more than 2 weeks, perform a failover (from C1 to C2) and a relocate back (C2 to C1) of an app. The operation was successful.
      2. Keep the cluster running for a day. Mirroring is lost from C2 to C1.

      Actual results:

      On one of the managed clusters (C2), a Globalnet IP was reallocated because of a race condition in the Globalnet controller code. The race is seen only during gateway (GW) migration in an environment with Globalnet and HA enabled. As a result, the Lighthouse component returned the wrong IP for a rook ns lookup query, which in turn stopped mirroring from C2 to C1.

      Expected results:

      Globalnet IPs should not be updated/reallocated.
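
One way to check the expected result is to record the global IP of each exported resource before and after a gateway migration and compare the two snapshots. The following is only a minimal sketch under stated assumptions: the function name `find_reallocated_ips`, the resource names, and the IP addresses are hypothetical illustrations, not Submariner APIs or data from this setup. How the snapshots are collected is left out.

```python
from typing import Dict, List, Tuple


def find_reallocated_ips(
    before: Dict[str, str], after: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Return (resource, old_ip, new_ip) for each resource whose global IP changed.

    Both arguments map a resource name to its global IP (hypothetical format).
    """
    changes = []
    for name, old_ip in sorted(before.items()):
        new_ip = after.get(name)
        # Resources absent from the later snapshot are skipped here;
        # only an IP that differs between snapshots counts as a reallocation.
        if new_ip is not None and new_ip != old_ip:
            changes.append((name, old_ip, new_ip))
    return changes


# Hypothetical snapshots taken before and after a gateway migration.
before = {"svc-a": "242.1.0.10", "svc-b": "242.1.0.11"}
after = {"svc-a": "242.1.0.10", "svc-b": "242.1.0.27"}

for name, old_ip, new_ip in find_reallocated_ips(before, after):
    print(f"{name}: global IP changed from {old_ip} to {new_ip}")
```

An empty result means no global IP changed between the two snapshots, which matches the expected behavior above.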

      Subctl gather logs

      http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/keerthana/Longevity-4.14/submariner/Globalnet/

       

      Additional info:

      Slack discussion of RCA 

      https://redhat-internal.slack.com/archives/C0134E73VH6/p1696321927050439
      https://redhat-internal.slack.com/archives/C0134E73VH6/p1696348571144659
      https://redhat-internal.slack.com/archives/C0134E73VH6/p1696495985131449

              tpanteli Thomas Pantelis
              kmanohar@redhat.com Keerthana Manoharan
              Maxim Babushkin Maxim Babushkin