- Bug
- Resolution: Done
- ACM 2.10.5
- Submariner Sprint 2024-27, Submariner Sprint 2024-28, Submariner Sprint 2024-29
Description of problem:
I filed this issue to track a DR issue on ROKS clusters that was reported on Slack.
Message from Slack:
The DR clusters didn't recover after a Gateway Node shutdown.
- Step 1: On cluster shaikh-roks415-dr-tok-odf6, the Gateway node 10.244.0.7 was shut down.
- Step 2: Submariner communication broke.
- Step 3: Annotated node 10.244.0.8 as GatewayNode and untagged 10.244.0.7 as GatewayNode. The "verify" test still failed.
- Step 4: Brought node 10.244.0.7 back up. Things are still unhealthy.
Version-Release number of selected component (if applicable):
oc get pods -n submariner-operator -o wide
NAME                                            READY  STATUS                RESTARTS  AGE    IP             NODE        NOMINATED NODE  READINESS GATES
query-iface-listhjb4d                           0/1    NodeAffinity          0         6d15h  <none>         10.244.0.7  <none>          <none>
query-iface-listx49jm                           0/1    NodeAffinity          0         6d15h  <none>         10.244.0.7  <none>          <none>
submariner-addon-7b5ccb6568-xhlzq               1/1    Running               0         6d15h  172.29.81.1    10.244.0.9  <none>          <none>
submariner-gateway-tvphv                        1/1    Terminating           0         5d14h  10.244.0.8     10.244.0.8  <none>          <none>
submariner-lighthouse-agent-75ffbd48d-qqs4c     1/1    Running               0         4d9h   172.29.80.193  10.244.0.9  <none>          <none>
submariner-lighthouse-coredns-67c7654989-sh7wh  1/1    Running               0         6d15h  172.29.234.13  10.244.0.8  <none>          <none>
submariner-lighthouse-coredns-67c7654989-vdj57  1/1    Running               0         12d    172.29.80.209  10.244.0.9  <none>          <none>
submariner-metrics-proxy-tfwkj                  0/1    CreateContainerError  0         5d14h  172.29.234.36  10.244.0.8  <none>          <none>
submariner-operator-6f58d65564-rxnf9            1/1    Running               0         12d    172.29.80.225  10.244.0.9  <none>          <none>
submariner-routeagent-l5fd2                     1/1    Running               0         12d    10.244.0.8     10.244.0.8  <none>          <none>
submariner-routeagent-m7g59                     1/1    Running               0         12d    10.244.0.9     10.244.0.9  <none>          <none>
submariner-routeagent-zmq9m                     1/1    Running               1         12d    10.244.0.7     10.244.0.7  <none>          <none>
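To make the unhealthy pods in the listing above easier to spot, here is a minimal Python sketch that parses the `oc get pods -o wide` text and reports every pod that is not fully ready and running. The helper name `unhealthy_pods` and the health rule (status `Running` with all containers ready, or status `Completed`) are assumptions made for illustration; they are not part of Submariner or `oc`.

```python
# Pod rows copied from the report; columns are whitespace-separated.
POD_LISTING = """\
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
query-iface-listhjb4d 0/1 NodeAffinity 0 6d15h <none> 10.244.0.7 <none> <none>
query-iface-listx49jm 0/1 NodeAffinity 0 6d15h <none> 10.244.0.7 <none> <none>
submariner-addon-7b5ccb6568-xhlzq 1/1 Running 0 6d15h 172.29.81.1 10.244.0.9 <none> <none>
submariner-gateway-tvphv 1/1 Terminating 0 5d14h 10.244.0.8 10.244.0.8 <none> <none>
submariner-lighthouse-agent-75ffbd48d-qqs4c 1/1 Running 0 4d9h 172.29.80.193 10.244.0.9 <none> <none>
submariner-lighthouse-coredns-67c7654989-sh7wh 1/1 Running 0 6d15h 172.29.234.13 10.244.0.8 <none> <none>
submariner-lighthouse-coredns-67c7654989-vdj57 1/1 Running 0 12d 172.29.80.209 10.244.0.9 <none> <none>
submariner-metrics-proxy-tfwkj 0/1 CreateContainerError 0 5d14h 172.29.234.36 10.244.0.8 <none> <none>
submariner-operator-6f58d65564-rxnf9 1/1 Running 0 12d 172.29.80.225 10.244.0.9 <none> <none>
submariner-routeagent-l5fd2 1/1 Running 0 12d 10.244.0.8 10.244.0.8 <none> <none>
submariner-routeagent-m7g59 1/1 Running 0 12d 10.244.0.9 10.244.0.9 <none> <none>
submariner-routeagent-zmq9m 1/1 Running 1 12d 10.244.0.7 10.244.0.7 <none> <none>
"""


def unhealthy_pods(listing):
    """Return (name, status, node) for each pod that looks unhealthy.

    Assumed rule (illustrative only): a pod is healthy when its status is
    "Running" and all containers are ready, or when its status is "Completed".
    """
    problems = []
    lines = listing.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 7:
            continue  # ignore malformed or truncated rows
        name, ready, status = fields[0], fields[1], fields[2]
        node = fields[6]
        ready_count, total_count = ready.split("/")
        all_ready = ready_count == total_count
        if status == "Completed" or (status == "Running" and all_ready):
            continue
        problems.append((name, status, node))
    return problems


if __name__ == "__main__":
    for name, status, node in unhealthy_pods(POD_LISTING):
        print(f"{name}: {status} on node {node}")
```

Under this rule the gateway pod is flagged even though it shows 1/1 ready, because it is stuck in Terminating on 10.244.0.8, the node that was annotated as the new gateway.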
How reproducible:
Steps to Reproduce:
- ...