Hi,
We use the gossip tunnel approach to connect Infinispan nodes. When testing this setup in Docker with a bridge network, we ran into stability issues that appeared to be caused by the network: we saw connection losses and "no heartbeat received" errors.
While trying to get this setup to work, we tried different configurations and turned on trace logging. We also captured and analyzed the network traffic with Wireshark. We tested with Windows Docker containers, with Linux Docker containers on Linux, and with Linux Docker containers running in WSL2.
The Wireshark traces showed continuous probing of all the hosts generated by the port_range setting. For us, this meant that our 2 configured hosts expanded into 18 additional hosts that had to be scanned. We have no need for this port_range scanning, and we found it odd that it did not stop once the original hosts were found, so we set port_range=0 without expecting it to affect stability. From that moment on, everything worked without stability issues. Everybody happy.
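To illustrate the numbers, here is a rough sketch of how we understand the expansion. It assumes each configured host:port is expanded to the ports port through port + port_range, which is our reading of the traces and not a statement about the actual implementation. The host names and port 7800 are made up, and port_range=9 is only inferred from 2 hosts producing 18 extra endpoints.

```python
def expand_hosts(initial_hosts, port_range):
    """Return every (host, port) endpoint that would be probed.

    Assumption: each configured (host, port) is expanded to
    port, port + 1, ..., port + port_range.
    """
    return [
        (host, port + offset)
        for host, port in initial_hosts
        for offset in range(port_range + 1)
    ]


# Hypothetical host names and port, for illustration only.
hosts = [("node-a", 7800), ("node-b", 7800)]

probed = expand_hosts(hosts, 9)    # port_range=9 is inferred, not confirmed
extra = len(probed) - len(hosts)   # endpoints beyond the configured ones

no_scan = expand_hosts(hosts, 0)   # port_range=0 leaves only the configured hosts
```

Under these assumptions, port_range=0 reduces the probed endpoints to exactly the two hosts we configured, which is the behavior we wanted.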
We did find it odd that the port_range setting could influence network stability. Maybe this is something you can investigate? Could it be related to discovery and the handling of cluster traffic not having separate threads, or to something else in this area?
We also noticed a side issue: the port_range setting did not take effect for us until we also set the external_addr field. Without external_addr, the port_range=0 configuration was not applied.