Bug
Resolution: Unresolved
Critical
None
1.2.11.Final, 1.3.1.Final
None
Credit where it's due: the issue was first spotted by rhatlapa.
Problem
It appears that passing NULL or "0.0.0.0" as bindaddr (the default when no AdvertiseBindAddress is set) to the following statement does not send to all interfaces. Instead, the first non-loopback interface is picked and the datagrams are sent only to it.
if ((rv = apr_sockaddr_info_get(&ma_listen_sa, bindaddr,
                                ma_mgroup_sa->family, bindport,
                                APR_UNSPEC, pool)) != APR_SUCCESS) {
    ap_log_error(APLOG_MARK, APLOG_ERR, rv, s,
                 "mod_advertise: ma_group_join apr_sockaddr_info_get(%s:%d) failed",
                 bindaddr, bindport);
As a result, no datagrams show up on the other interfaces. Surprisingly, this is not deterministic: after dozens or hundreds of messages, a single datagram eventually reaches another interface.
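For comparison, with plain BSD sockets on Linux a socket bound to the wildcard address does not fan a multicast datagram out to every interface. The kernel picks a single outgoing interface from its routing table unless the sender pins one with IP_MULTICAST_IF. The sketch below is not the mod_advertise code; it is a minimal POSIX illustration, and the helper name set_mcast_if is hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper, not part of mod_advertise: pin the outgoing
 * interface for multicast datagrams sent on fd to the NIC that owns
 * ifaddr (an IPv4 literal such as "172.17.72.254").
 * Returns 0 on success, -1 on failure. */
static int set_mcast_if(int fd, const char *ifaddr)
{
    struct in_addr nic;

    /* Reject anything that is not a valid IPv4 literal. */
    if (inet_pton(AF_INET, ifaddr, &nic) != 1)
        return -1;

    /* Ask the kernel to send multicast traffic via this NIC only. */
    if (setsockopt(fd, IPPROTO_IP, IP_MULTICAST_IF, &nic, sizeof(nic)) != 0)
        return -1;

    return 0;
}
```

A sender that wanted every interface to see the advertisement would have to repeat the send once per interface, calling such a helper before each one.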
Impact
Picture this simple scenario: There are two interfaces, e.g.
enp1s0 10.16.88.187
enp2s0 172.18.0.1
listed in this exact order by ip addr show.
An EAP 7 (WildFly 10) instance with mod_cluster is bound to the IP address 172.18.0.1, which implies the enp2s0 interface.
Furthermore, an Apache HTTP Server instance with mod_cluster is bound to the same IP address, 172.18.0.1, i.e. the MCMP VirtualHost and the main VirtualHost all Listen on this IP address.
Result: Without advertising, using an explicit proxy-list, all is well. MCMP works, requests work, balancing works.
On the other hand, when relying on advertisement, it can take EAP 7 (WildFly 10) minutes to register with the balancer.
The reason is that the vast majority of UDP multicast datagrams arrive at enp1s0, where EAP 7 (WildFly 10) doesn't see them.
Reproducer
Let me demonstrate with a recently refactored advertise.c utility for sending datagrams and the well-known Advertize.java utility for receiving them.
For your convenience, here are binaries built from the aforementioned sources:
- Advertize java utility: Advertize.class
- advertise native utility (Linux3 x86_64): advertise-linux3_x86_64.zip
- advertise native utility (Windows x86): advertise-windows_x86.zip
Demonstration on Linux
System
[mbabacek@perf09 ~]$ ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:18:8b:7a:46:04 brd ff:ff:ff:ff:ff:ff
    inet 10.16.88.187/21 brd 10.16.95.255 scope global enp1s0
       valid_lft forever preferred_lft forever
    inet 10.16.93.253/21 brd 10.16.95.255 scope global secondary enp1s0
       valid_lft forever preferred_lft forever
3: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:18:8b:7a:46:05 brd ff:ff:ff:ff:ff:ff
    inet 172.17.72.254/19 brd 172.17.95.255 scope global enp2s0
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
    link/ether 02:42:07:ab:74:f9 brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
Java
[mbabacek@perf09 ~]$ java -version
openjdk version "1.8.0_71"
OpenJDK Runtime Environment (build 1.8.0_71-b15)
OpenJDK 64-Bit Server VM (build 25.71-b15, mixed mode)
Advertise SENT
[mbabacek@perf09 ~]$ date;./advertise -a 224.0.1.102 -p 33364
Mon Mar 21 12:39:51 EDT 2016
UDP Multicast address to send datagrams to. Value: 224.0.1.102
UDP Multicast port. Value: 33364
IP address of the NIC to bound to. Value: NULL
apr_socket_bind on 0.0.0.0:0
apr_mcast_join on 0.0.0.0:0
apr_socket_sendto to 224.0.1.102:33364
Advertize RECEIVED
YES
[mbabacek@perf09 ~]$ java Advertize 224.0.1.102 33364
Linux like OS
ready waiting...
received: Advertize !!! Mon, 21 Mar 2016 16:39:51 GMT
received from /10.16.88.187:38907
YES
[mbabacek@perf09 ~]$ java Advertize 224.0.1.102 33364 10.16.88.187
Linux like OS
ready waiting...
received: Advertize !!! Mon, 21 Mar 2016 16:39:51 GMT
received from /10.16.88.187:38907
NO
[mbabacek@perf09 ~]$ java Advertize 224.0.1.102 33364 172.17.72.254
Linux like OS
ready waiting...
YES
[mbabacek@perf09 ~]$ java Advertize 224.0.1.102 33364 0.0.0.0
Linux like OS
ready waiting...
received: Advertize !!! Mon, 21 Mar 2016 16:39:51 GMT
received from /10.16.88.187:38907
And now let's take a look at sending via 172.17.72.254, i.e. enp2s0.
Advertise SENT
[mbabacek@perf09 ~]$ date;./advertise -a 224.0.1.102 -p 33364 -n 172.17.72.254
Mon Mar 21 12:42:57 EDT 2016
UDP Multicast address to send datagrams to. Value: 224.0.1.102
UDP Multicast port. Value: 33364
IP address of the NIC to bound to. Value: 172.17.72.254
apr_socket_bind on 172.17.72.254:0
apr_mcast_join on 172.17.72.254:0
apr_socket_sendto to 224.0.1.102:33364
Advertize RECEIVED
NO
[mbabacek@perf09 ~]$ java Advertize 224.0.1.102 33364
Linux like OS
ready waiting...
NO
[mbabacek@perf09 ~]$ java Advertize 224.0.1.102 33364 10.16.88.187
Linux like OS
ready waiting...
YES
[mbabacek@perf09 ~]$ java Advertize 224.0.1.102 33364 172.17.72.254
Linux like OS
ready waiting...
received: Advertize !!! Mon, 21 Mar 2016 16:42:57 GMT
received from /172.17.72.254:35452
NO
[mbabacek@perf09 ~]$ java Advertize 224.0.1.102 33364 0.0.0.0
Linux like OS
ready waiting...
Demonstration on Windows
Note that the problem does not occur on Windows: all interfaces receive the advertisement.
Advertise SENT
C:\Users\karm\advertise-build
λ advertise.exe -a 224.0.1.102 -p 33364
UDP Multicast address to send datagrams to. Value: 224.0.1.102
UDP Multicast port. Value: 33364
IP address of the NIC to bound to. Value: NULL
apr_socket_bind on 0.0.0.0:0
apr_mcast_join on 0.0.0.0:0
apr_socket_sendto to 224.0.1.102:33364
Advertize RECEIVED
YES
C:\Users\karm\WORKSPACE
λ "C:\Program Files\Java\jdk1.8.0_74\bin\java" Advertize 224.0.1.102 33364
ready waiting...
received: Advertize !!! Mon, 21 Mar 2016 18:07:50 GMT
received from /192.168.122.52:61805
YES
C:\Users\karm\WORKSPACE
λ "C:\Program Files\Java\jdk1.8.0_74\bin\java" Advertize 224.0.1.102 33364 192.168.122.52
ready waiting...
received: Advertize !!! Mon, 21 Mar 2016 18:07:50 GMT
received from /192.168.122.52:61805
YES
C:\Users\karm\WORKSPACE
λ "C:\Program Files\Java\jdk1.8.0_74\bin\java" Advertize 224.0.1.102 33364 192.168.122.199
ready waiting...
received: Advertize !!! Mon, 21 Mar 2016 18:07:50 GMT
received from /192.168.122.52:61805
Advertise SENT
C:\Users\karm\advertise-build
λ advertise.exe -a 224.0.1.102 -p 33364 -n 192.168.122.199
UDP Multicast address to send datagrams to. Value: 224.0.1.102
UDP Multicast port. Value: 33364
IP address of the NIC to bound to. Value: 192.168.122.199
apr_socket_bind on 192.168.122.199:0
apr_mcast_join on 192.168.122.199:0
apr_socket_sendto to 224.0.1.102:33364
Advertize RECEIVED
YES
C:\Users\karm\WORKSPACE
λ "C:\Program Files\Java\jdk1.8.0_74\bin\java" Advertize 224.0.1.102 33364
ready waiting...
received: Advertize !!! Mon, 21 Mar 2016 18:09:55 GMT
received from /192.168.122.199:52781
YES
C:\Users\karm\WORKSPACE
λ "C:\Program Files\Java\jdk1.8.0_74\bin\java" Advertize 224.0.1.102 33364 192.168.122.52
ready waiting...
received: Advertize !!! Mon, 21 Mar 2016 18:09:55 GMT
received from /192.168.122.199:52781
YES
C:\Users\karm\WORKSPACE
λ "C:\Program Files\Java\jdk1.8.0_74\bin\java" Advertize 224.0.1.102 33364 192.168.122.199
ready waiting...
received: Advertize !!! Mon, 21 Mar 2016 18:09:55 GMT
received from /192.168.122.199:52781
Suggestion
Ideas? rhn-engineering-jclere, rhn-engineering-rhusar
I suggest defaulting bindaddr (AdvertiseBindAddress) to the main_server's address, or to that of the MCMP-enabled vhost, instead of NULL. I'll post a PR for evaluation.
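The suggested fallback could be sketched roughly as follows. This is not the proposed patch: effective_bindaddr and its parameters are hypothetical names for illustration, and how the server or vhost address would actually be obtained inside mod_advertise is left open.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper illustrating the suggestion; the names are not
 * taken from mod_advertise. Returns the configured AdvertiseBindAddress
 * when it names a concrete address, otherwise the fallback, e.g. the
 * address of the main server or of the MCMP-enabled vhost. */
static const char *effective_bindaddr(const char *configured,
                                      const char *fallback)
{
    /* NULL, an empty string and "0.0.0.0" all mean "no explicit NIC". */
    int is_wildcard = configured == NULL
                   || configured[0] == '\0'
                   || strcmp(configured, "0.0.0.0") == 0;

    if (!is_wildcard)
        return configured;

    /* fallback may itself be NULL when the server has no concrete
     * address either; the caller then keeps today's behaviour. */
    return fallback;
}
```

With the scenario from the Impact section, an unset AdvertiseBindAddress and a server listening on 172.18.0.1 would yield 172.18.0.1 as the bind address, while an explicitly configured address would still win.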
is incorporated by
- MODCLUSTER-531 Eliminate automagic (tracker) (Closed)
relates to
- MODCLUSTER-495 mod_cluster UDP multicast listener does not honour the proper interface (Resolved)
- JWS-407 mod_cluster UDP multicast listener does not honour the proper interface (Closed)