  1. JGroups
  2. JGRP-2135

OOM with JGroups 3.6.11.


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version/s: 3.6.12, 3.6.20.Final
    • Affects Version/s: 3.6.11
    • None

      We are running our JVMs with -XX:OnOutOfMemoryError="kill -9 %p".

      We have been experiencing OOMs fairly often, and they happen at:

      Object / Stack Frame                                                              |Name                                                                                             | Shallow Heap | Retained Heap |Context Class Loader                         |Is Daemon
      ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      java.lang.Thread @ 0x81bdf838                                                     |Connection.Receiver [144.77.77.53:50363 - 144.77.77.53:50363],sis-cluster.service,prodpmwsv5-6461|          120 |           456 |sun.misc.Launcher$AppClassLoader @ 0x800175a8|false
      |- at java.lang.OutOfMemoryError.<init>()V (OutOfMemoryError.java:48)             |                                                                                                 |              |               |                                             |
      |- at org.jgroups.blocks.cs.TcpConnection$Receiver.run()V (TcpConnection.java:310)|                                                                                                 |              |               |                                             |
      |- at java.lang.Thread.run()V (Thread.java:745)                                   |                                                                                                 |              |               |                                             |
      ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      

      The code where it happens is in TcpConnection.java:

      while(canRun()) {
          try {
              int len=in.readInt();
              if(buffer == null || buffer.length < len)
                  buffer=new byte[len];
              in.readFully(buffer, 0, len);
              updateLastAccessed();
              server.receive(peer_addr, buffer, 0, len);
          }
          catch(OutOfMemoryError mem_ex) {
              t=mem_ex;
              break; // continue;
          }
          catch(IOException io_ex) {
              t=io_ex;
              break;
          }
          catch(Throwable e) {
          }
      }
      

      The failure occurs when allocating: buffer=new byte[len];

      It looks to me as if an invalid, very large length value is received and the process runs out of memory trying to allocate a huge byte array.
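
To make this suspicion concrete, here is a small self-contained sketch of how four arbitrary bytes on the wire become a very large length once they are read with DataInputStream.readInt(). The stray HTTP request used as input is purely an assumption for illustration; nothing in the heap dump identifies what was actually received. The class and method names are hypothetical and are not part of JGroups.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not JGroups code.
final class BogusLengthDemo {

    /** Interprets the first four bytes as a big-endian int, as DataInputStream.readInt() does. */
    static int lengthPrefixOf(byte[] wire) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire))) {
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed stray traffic: the start of an HTTP request sent to the cluster port.
        byte[] stray = "GET / HTTP/1.1\r\n".getBytes(StandardCharsets.US_ASCII);

        // 'G' 'E' 'T' ' ' is 0x47 0x45 0x54 0x20, i.e. 0x47455420.
        int len = lengthPrefixOf(stray);
        System.out.println(len); // prints 1195725856
    }
}
```

A length of that magnitude is larger than a 1 GB heap, so new byte[len] would fail no matter how little memory the application itself is using.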

      Running the JVMs without kill-on-OOM would make this issue "disappear", in the sense that the error is swallowed by:

                      catch(OutOfMemoryError mem_ex) {
                          t=mem_ex;
                          break; // continue;
                      }
      

      Catching OutOfMemoryError is a strange implementation choice.
      Instead, a size limit should be enforced to protect against receiving invalid sizes.

      My heap limit is 1 GB and my heap dumps are 50 MB, so the attempted allocation must be huge.
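
The size limit suggested above could look roughly like the following. This is a minimal sketch under stated assumptions, not the JGroups implementation or the fix that was eventually applied: the class name, the readFrame helper, and the 16 MB MAX_FRAME_LENGTH value are all hypothetical. The idea is to validate the length prefix before allocating, and to report a bad length as an IOException so the receiver closes the connection instead of running into an OutOfMemoryError.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper, not JGroups code.
final class BoundedFrameReader {

    // Assumed upper bound for one message; choose a value that fits the deployment.
    static final int MAX_FRAME_LENGTH = 16 * 1024 * 1024;

    /** Reads one length-prefixed frame, rejecting negative or oversized lengths before allocating. */
    static byte[] readFrame(DataInputStream in, int maxLength) throws IOException {
        int len = in.readInt();
        if (len < 0 || len > maxLength)
            throw new IOException("invalid frame length " + len + " (limit " + maxLength + ")");
        byte[] buffer = new byte[len];
        in.readFully(buffer, 0, len);
        return buffer;
    }

    public static void main(String[] args) {
        // Length prefix 0x7FFFFFFF (Integer.MAX_VALUE) with no payload behind it.
        byte[] bogus = {0x7F, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF};
        try {
            readFrame(new DataInputStream(new ByteArrayInputStream(bogus)), MAX_FRAME_LENGTH);
            System.out.println("frame accepted");
        } catch (IOException e) {
            // Reached without any large allocation having been attempted.
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In a receiver loop like the one quoted from TcpConnection.java, the existing catch(IOException) branch would then end the loop for that connection, and the OutOfMemoryError handler would no longer be the only line of defense against a corrupt length.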

              Assignee: Bela Ban (rhn-engineering-bban)
              Reporter: Zoltan Farkas (zolyfarkas_jira, Inactive)