Bug
Resolution: Unresolved
Major
1.10.1.GA
In a 3-router mesh deployment on OpenShift 4.8.35, we observe the following behavior when we drain an application node containing one of the routers.
1. The remaining routers that are not moved continue to function as normal
2. The moved router starts continuously logging "No route to host" connection failures, in the following format:
2022-05-13 19:32:56.598748 +0000 SERVER (info) [C113] Connection to 10.210.46.90:55672 failed: proton:io No route to host - disconnected 10.210.46.90:55672
3. The IP address in these log entries is the former address of the moved router (i.e. it looks as if the router is trying to connect to itself on its old IP address)
4. We can see applications connect to the router, but it appears deliveries remain stuck / unsettled for these connections:
2022-05-13 19:36:07.127228 +0000 ROUTER_CORE (info) [C2][L40] Stuck delivery: At least one delivery on this link has been undelivered/unsettled for more than 10 seconds
It appears that the router's old IP address is not removed somewhere, and the router keeps attempting to add a connector to that old address. It is unclear whether this is related to the unsettled deliveries or is just another manifestation of the same underlying cause.
Other notes: Along with one of the routers, several of the client application pods were also migrated.
Restarting or killing the router seems to resolve the issue: when it comes back, message flow resumes.
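
To make it easier to check which addresses the moved router keeps retrying and which links report stuck deliveries, the two kinds of log line quoted above can be tallied with a short script. This is only a minimal sketch: the regular expressions are derived solely from the two sample lines in this report, and the names `FAILED_CONN`, `STUCK`, and `summarize` are hypothetical helpers, not part of the router or any of its tools.

```python
import re
from collections import Counter

# Patterns inferred only from the two log lines quoted in this report;
# other router versions may format these messages differently.
FAILED_CONN = re.compile(
    r"SERVER \(\w+\) \[C\d+\] Connection to (?P<addr>\S+) failed: "
    r"(?P<reason>.+?) - disconnected"
)
STUCK = re.compile(
    r"ROUTER_CORE \(\w+\) \[(?P<conn>C\d+)\]\[(?P<link>L\d+)\] Stuck delivery"
)


def summarize(lines):
    """Count failed connection targets and (connection, link) pairs with stuck deliveries."""
    failed = Counter()
    stuck = Counter()
    for line in lines:
        m = FAILED_CONN.search(line)
        if m:
            failed[m.group("addr")] += 1
            continue
        m = STUCK.search(line)
        if m:
            stuck[(m.group("conn"), m.group("link"))] += 1
    return failed, stuck


# The two sample lines from this report.
sample = [
    "2022-05-13 19:32:56.598748 +0000 SERVER (info) [C113] Connection to "
    "10.210.46.90:55672 failed: proton:io No route to host - disconnected "
    "10.210.46.90:55672",
    "2022-05-13 19:36:07.127228 +0000 ROUTER_CORE (info) [C2][L40] Stuck "
    "delivery: At least one delivery on this link has been "
    "undelivered/unsettled for more than 10 seconds",
]

failed, stuck = summarize(sample)
for addr, count in failed.items():
    print(f"failed connection target {addr}: {count}")
for (conn, link), count in stuck.items():
    print(f"stuck delivery on {conn}/{link}: {count}")
```

Running the same function over the router pod's saved log output (for example, a file captured with `oc logs`) and comparing the failed targets against the current pod IPs would show whether every failure points at the router's own former address.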