JGroups / JGRP-1117

BasicConnectionTable.Connection: buffer only increases, never shrinks

Details

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version/s: 2.4.8, 2.6.14, 2.8
    • None
    • None

    Description

      The byte[] buffer in run() grows to fit the largest message read so far and is never shrunk. If we read a wrong length field, e.g. because someone connects and sends garbage, the buffer is resized to that bogus length and stays that large for the lifetime of the connection.

      SOLUTION: create the buffer anew for every message.
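A minimal sketch of the per-message allocation described above, not the actual JGroups patch. The class name FrameReader, the method readFrame, and the MAX_FRAME_SIZE bound are hypothetical; the sketch also assumes the length field is a 4-byte big-endian int, which is consistent with the buffer size reported in the email below. The upper-bound check is an extra safeguard beyond the stated solution.

```java
import java.io.DataInputStream;
import java.io.IOException;

final class FrameReader {
    // Hypothetical sanity limit (64 MB); not a JGroups configuration value.
    static final int MAX_FRAME_SIZE = 64 * 1024 * 1024;

    /** Reads one length-prefixed message into a freshly allocated buffer. */
    static byte[] readFrame(DataInputStream in) throws IOException {
        int len = in.readInt(); // 4-byte big-endian length field (assumption)
        if (len < 0 || len > MAX_FRAME_SIZE) {
            // A bogus length (e.g. stray bytes from a non-JGroups client) is rejected
            // before any allocation takes place.
            throw new IOException("implausible message length: " + len);
        }
        // New buffer per message: once the message has been processed nothing
        // references it, so a single large message cannot pin memory indefinitely.
        byte[] buf = new byte[len];
        in.readFully(buf, 0, len);
        return buf;
    }
}
```

Allocating per message trades some garbage for bounded retention. Without the bound check, a bogus length would still trigger one large allocation, but that buffer would become collectable as soon as the read fails or completes.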

      [Email from Jimmy Wilson]
      One of the issues appears to lead to a 1.5GB byte buffer inside ConnectionTable that is never reclaimed (it's the latest issue discussed on the case). We have heap dumps showing that this byte buffer grew to this size.

      Dennis's comments from the ticket:

      > The buffer appears to be the re-sizeable one from org.jgroups.blocks.BasicConnectionTable.Connection.run().

      > The contents of the buffer are:
      > JGroups version (2.4.5)
      > IP Address (172.20.30.15:7900)
      > list of: length (53)
      > org.jgroups.Message (TCPPING for null-Data channel)

      > The full data JGroups sends on a TCP connection is:
      > Cookie ('b','e','l','a')
      > JGroups version
      > IP address
      > list of: length, Message

      > Connection.readPeerAddress is responsible for reading the cookie, JGroups version, and IP address. Connection.run then runs a loop that
      > reads a length and Message and processes them.

      > The buffer size is 1,650,814,049 => 0x62656c61 => "bela".

      > So it looks like Connection.readPeerAddress is not pulling anything
      > off the socket input stream before Connection.run is called, so run
      > reads the cookie, thinks it's the buffer length to use, and starts
      > reading until the buffer is full.
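The arithmetic in Dennis's analysis can be checked with a short sketch. It assumes the length field is read as a 4-byte big-endian int, as DataInputStream.readInt() does; the class and method names are hypothetical and only illustrate the misread.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class CookieAsLength {
    /** Interprets the first four bytes as a big-endian int, as a length read would. */
    static int readAsLength(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        return in.readInt();
    }

    /** The cookie bytes 'b','e','l','a' misread as a message length. */
    static int cookieAsLength() throws IOException {
        byte[] cookie = "bela".getBytes(StandardCharsets.US_ASCII); // 0x62 0x65 0x6c 0x61
        return readAsLength(cookie);
    }
}
```

CookieAsLength.cookieAsLength() evaluates to 0x62656c61, i.e. 1,650,814,049, which matches the buffer size seen in the heap dump.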

      Dennis and I have spent some time looking at how this might happen, but the code does not appear to be the problem.

      The server side doesn't appear to be the problem because the 'bela' cookie/array is read only once (when a connection is accepted); for the code path to be what it appears to be, that array would have to be in the input stream twice.

      The client side doesn't appear to be the problem as connections are created inside a synchronized block, and each connection is only initialized once (therefore the 'bela' cookie/array is only sent once).

      Dennis currently suspects a JVM issue with threads, but we're just guessing at this point.

      Do you have any thoughts on what might be happening here (there are other, more recent, private comments on the case)?

          People

            rhn-engineering-bban Bela Ban