  1. Red Hat Advanced Cluster Management
  2. ACM-12905

Submariner on ROKS DR clusters did not recover after Gateway Node shutdown.


    • Submariner Sprint 2024-27, Submariner Sprint 2024-28

      Description of problem:

      I filed this issue to track a DR issue on ROKS clusters that was reported on Slack.

      Message from Slack:

       

      The DR clusters did not recover after Gateway Node shutdown.
      ---Step1---
      On cluster: shaikh-roks415-dr-tok-odf6
      Gateway node shut down: 10.244.0.7
      ---Step2---
      Submariner communication broke
      ---Step3---
      Annotated node 10.244.0.8 as GatewayNode and
      untagged 10.244.0.7 as GatewayNode
      The "verify" test still failed
      ---Step4---
      Brought node 10.244.0.7 back up
      Things are still unhealthy
      
      Version-Release number of selected component (if applicable):
      oc get pods -n submariner-operator -o wide
      NAME                                             READY   STATUS                 RESTARTS   AGE     IP              NODE         NOMINATED NODE   READINESS GATES
      query-iface-listhjb4d                            0/1     NodeAffinity           0          6d15h   <none>          10.244.0.7   <none>           <none>
      query-iface-listx49jm                            0/1     NodeAffinity           0          6d15h   <none>          10.244.0.7   <none>           <none>
      submariner-addon-7b5ccb6568-xhlzq                1/1     Running                0          6d15h   172.29.81.1     10.244.0.9   <none>           <none>
      submariner-gateway-tvphv                         1/1     Terminating            0          5d14h   10.244.0.8      10.244.0.8   <none>           <none>
      submariner-lighthouse-agent-75ffbd48d-qqs4c      1/1     Running                0          4d9h    172.29.80.193   10.244.0.9   <none>           <none>
      submariner-lighthouse-coredns-67c7654989-sh7wh   1/1     Running                0          6d15h   172.29.234.13   10.244.0.8   <none>           <none>
      submariner-lighthouse-coredns-67c7654989-vdj57   1/1     Running                0          12d     172.29.80.209   10.244.0.9   <none>           <none>
      submariner-metrics-proxy-tfwkj                   0/1     CreateContainerError   0          5d14h   172.29.234.36   10.244.0.8   <none>           <none>
      submariner-operator-6f58d65564-rxnf9             1/1     Running                0          12d     172.29.80.225   10.244.0.9   <none>           <none>
      submariner-routeagent-l5fd2                      1/1     Running                0          12d     10.244.0.8      10.244.0.8   <none>           <none>
      submariner-routeagent-m7g59                      1/1     Running                0          12d     10.244.0.9      10.244.0.9   <none>           <none>
      submariner-routeagent-zmq9m                      1/1     Running                1          12d     10.244.0.7      10.244.0.7   <none>           <none> 
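
To make the unhealthy pods in a listing like the one above easier to spot, the output can be piped through a small filter. This is only a sketch: `unhealthy_pods` is a hypothetical helper name, not part of `oc` or Submariner, and it assumes the default `oc get pods` column order (NAME, READY, STATUS, ...).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of oc or Submariner).
# Reads `oc get pods` output on stdin and prints NAME, READY and STATUS
# for every pod whose STATUS is not "Running" or whose READY count is not n/n.
# Assumes the default column order: NAME READY STATUS ...
unhealthy_pods() {
  awk 'NR > 1 && NF >= 3 {            # skip the header row and blank lines
         split($2, ready, "/")        # READY looks like "0/1"
         if ($3 != "Running" || ready[1] != ready[2])
           printf "%s %s %s\n", $1, $2, $3
       }'
}

# Usage against a live cluster (requires cluster access):
#   oc get pods -n submariner-operator -o wide | unhealthy_pods
```

Applied to the listing above, this filter would report four pods: the two `query-iface-list` pods (NodeAffinity), `submariner-gateway-tvphv` (Terminating), and `submariner-metrics-proxy-tfwkj` (CreateContainerError). Note that the only gateway pod in the listing is the one in Terminating state. Pods that finished normally (for example, STATUS `Completed`) would also be flagged by this simple check.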

       

      How reproducible:

      Steps to Reproduce:

      1.  
      2.  
      3. ...

      Actual results:

      Expected results:

      Additional info:

            yboaron Yossi Boaron
            yboaron Yossi Boaron
            Maxim Babushkin Maxim Babushkin
