- Bug
- Resolution: Won't Do
- Major
- 2.4.1 SP4
- None
- Workaround Exists
-
When TCPPING is used with port_range, i.e. when there is more than one possible port to bind to, the JVM may run into random OOMs.
The issue is with the local port bind in ConnectionTable.java in JGroups-2.4.1-sp4.src/src/org/jgroups/blocks. Ping requests are sent to all ports in the port range on all boxes in the cluster, including the local box. When one of these requests tries to connect to an unused port in the range on the local box, the getConnection method binds a local socket before calling connect. Because the bind uses port 0, the operating system chooses the local port, and it may choose the very port in the port range that the connection is being established to.
This in turn allows the connect to go through, even though no accept thread is waiting on that port.
Connection getConnection(Address dest) throws Exception {
    Connection conn=null;
    Socket sock;
    synchronized(conns) {
        conn=(Connection)conns.get(dest);
        if(conn == null) {
            // changed by bela Jan 18 2004: use the bind address for the client sockets as well
            SocketAddress tmpBindAddr=new InetSocketAddress(bind_addr, 0);
            InetAddress tmpDest=((IpAddress)dest).getIpAddress();
            SocketAddress destAddr=new InetSocketAddress(tmpDest, ((IpAddress)dest).getPort());
            sock=new Socket();
            sock.bind(tmpBindAddr);
            sock.setKeepAlive(true);
            sock.setTcpNoDelay(tcp_nodelay);
            if(linger > 0)
                sock.setSoLinger(true, linger);
            else
                sock.setSoLinger(false, -1);
            sock.connect(destAddr, sock_conn_timeout);
This results in a connection where the local address is sent to the other side, but there is no accept thread to read it out.
            conn=new Connection(sock, dest);
            conn.sendLocalAddress(local_addr);
            notifyConnectionOpened(dest);
            // conns.put(dest, conn);
            addConnection(dest, conn);
            conn.init();
            if(log.isInfoEnabled())
                log.info("created socket to " + dest);
        }
        return conn;
    }
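The collision described above leaves a socket whose local and remote endpoints are identical. The following is a minimal sketch of a check for that condition, not a fix taken from JGroups: the class SelfConnectCheck and the method isSelfConnect are hypothetical names introduced here for illustration.

```java
import java.net.InetSocketAddress;

class SelfConnectCheck {
    /**
     * Returns true when both endpoints have the same IP address and port,
     * i.e. the socket is connected to itself.
     * (Hypothetical helper, not part of JGroups.)
     */
    static boolean isSelfConnect(InetSocketAddress local, InetSocketAddress remote) {
        return local.getAddress() != null
                && local.getAddress().equals(remote.getAddress())
                && local.getPort() == remote.getPort();
    }

    public static void main(String[] args) {
        // Literal IP strings are parsed without a DNS lookup.
        InetSocketAddress dest = new InetSocketAddress("127.0.0.1", 6789);
        InetSocketAddress collidingLocal = new InetSocketAddress("127.0.0.1", 6789);
        InetSocketAddress normalLocal = new InetSocketAddress("127.0.0.1", 40123);

        System.out.println("colliding bind: " + isSelfConnect(collidingLocal, dest));
        System.out.println("normal bind:    " + isSelfConnect(normalLocal, dest));
    }
}
```

In a method like getConnection, such a check could run right after sock.connect(...), comparing sock.getLocalSocketAddress() with sock.getRemoteSocketAddress() (both cast to InetSocketAddress) and closing the socket on a match.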
When the local address is not read out before the reader thread is started, it is eventually read in as the length to allocate for reading the packet, in BasicConnectionTable.java in JGroups-2.4.1-sp4.src/src/org/jgroups/blocks.
len=in.readInt();
if(len > buf.length)
    buf=new byte[len];
in.readFully(buf, 0, len);
updateLastAccessed();
receive(peer_addr, buf, 0, len); // calls receiver.receive(msg)
In our case this allocated 1.6G of memory, and the JVM would sometimes run out of memory in other parts of the program, depending on how much memory was in use at the time.
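The effect of reading stray bytes as a length prefix can be sketched as follows. This is an illustration under assumptions, not code from JGroups: LengthGuard, readCheckedLength and MAX_LENGTH are hypothetical names, and the four bytes used (ASCII "bela") are an assumed example chosen because they decode to roughly 1.65 billion, close to the allocation reported above. The report does not state which bytes were actually read.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class LengthGuard {
    // Hypothetical upper bound for a single message; not a JGroups setting.
    static final int MAX_LENGTH = 64 * 1024 * 1024;

    /** Reads a 4-byte big-endian length prefix and rejects implausible values. */
    static int readCheckedLength(DataInputStream in) throws IOException {
        int len = in.readInt();
        if (len < 0 || len > MAX_LENGTH)
            throw new IOException("implausible message length: " + len);
        return len;
    }

    public static void main(String[] args) throws IOException {
        // Assumed stray bytes that were never meant to be a length prefix.
        byte[] stray = "bela".getBytes(StandardCharsets.US_ASCII);

        // Unchecked read: this value would be passed straight to new byte[len].
        int unchecked = new DataInputStream(new ByteArrayInputStream(stray)).readInt();
        System.out.println("stray bytes read as length: " + unchecked);

        // Checked read: the bogus length is rejected before any allocation.
        try {
            readCheckedLength(new DataInputStream(new ByteArrayInputStream(stray)));
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A bound like this does not remove the port collision itself; it only keeps a misread length from turning into a very large allocation.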
A test program that reproduces the port collision is attached. Sample invocation below.
bash-2.05$ java Test vlinux101
Connected : 6789 to 6789 on try 46267
bash-2.05$
- is related to
- JGRP-1530 TCPConnectionMap$TCPConnection$ConnectionPeerReceiver allocate and hold a 1.6G byte[]
- Resolved