JBoss Enterprise Application Platform / JBEAP-6414

[GSS](7.0.z) Loss of domain controller manageability after an OOM

Details

    • Workaround Exists
    • Workaround:

      Kill the instance and restart the servers.

    • Steps to Reproduce:

      1. Configure an EAP environment with 1 Domain Controller and 2 Host Controllers with 1 server each.
      2. Access the domain controller and start the servers.
      3. Deploy a RESTful service whose endpoint runs the following code:

      // Start sleeping threads in an unbounded loop until the JVM
      // can no longer create native threads.
      while (true) {
          new Thread(new Runnable() {
              public void run() {
                  try {
                      Thread.sleep(10000000);
                  } catch (InterruptedException e) {
                      // ignored: the thread only needs to stay alive
                  }
              }
          }).start();
      }

      4. Send a request to the REST endpoint and wait for the OOM error (unable to create new native thread).
      5. Try to perform any operation in the web console.

      The issue is also reported in Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1259767

    • EAP 7.0.4

    Description

      Enterprise customers usually deploy a domain controller with many host controllers and managed servers in their production environments. If a managed server hits an out-of-memory error (unable to create new native thread), the domain controller becomes unresponsive.
      To reproduce the out-of-memory condition, we used:

      ...
      while (true) {
          new Thread(new Runnable() {
              public void run() {
                  try {
                      Thread.sleep(10000000);
                  } catch (InterruptedException e) {
                  }
              }
          }).start();
      }
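
For anyone who wants to observe the same thread-leak pattern without exhausting the host, the loop above can be sketched in a bounded form. This is only an illustration, not the code deployed in the reproducer: the class name `BoundedThreadLeak`, the `maxThreads` cap, the daemon flag, and the `release` helper are all assumptions added so that the example terminates.

```java
import java.util.ArrayList;
import java.util.List;

// Bounded variant of the reproducer. The cap (maxThreads), the daemon flag
// and the release() helper are assumptions added for safety; the original
// loop is unbounded and never cleans up.
class BoundedThreadLeak {

    /** Starts up to maxThreads sleeping threads and returns the ones that started. */
    static List<Thread> spawnSleepers(int maxThreads) {
        List<Thread> started = new ArrayList<>();
        try {
            for (int i = 0; i < maxThreads; i++) {
                Thread t = new Thread(() -> {
                    try {
                        Thread.sleep(10_000_000L);
                    } catch (InterruptedException e) {
                        // Interrupted by release(): let the thread finish.
                        Thread.currentThread().interrupt();
                    }
                });
                t.setDaemon(true);
                // start() is where "unable to create new native thread" surfaces.
                t.start();
                started.add(t);
            }
        } catch (OutOfMemoryError e) {
            System.err.println("Thread creation failed after "
                    + started.size() + " threads: " + e.getMessage());
        }
        return started;
    }

    /** Interrupts and joins the sleepers so their native threads are released. */
    static void release(List<Thread> threads) {
        for (Thread t : threads) {
            t.interrupt();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public static void main(String[] args) {
        List<Thread> threads = spawnSleepers(100);
        System.out.println("started " + threads.size() + " threads");
        release(threads);
    }
}
```

Removing the cap and the cleanup step gives back the behavior of the original loop, which keeps starting threads until the JVM can no longer create native ones.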

      Actual results:

      • Once the OOME has occurred, go to the management console. It misbehaves: for example, servers are not listed or metrics are no longer updated.
      • After the host controller is restarted, everything works again.

      Expected results:
      An OOME on a managed server should not impact the management console. The deployment action should either recover from the error or kill the server automatically.
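
As a rough illustration of the "kill the server automatically" option, an application-level guard could catch the error and hand it to a termination hook. This is a hypothetical sketch, not how EAP handles the condition and not a proposed fix: the `OomGuard` class and `runGuarded` method are invented names, and in a real server the hook might call `Runtime.getRuntime().halt(1)` instead of logging.

```java
import java.util.function.Consumer;

// Hypothetical fail-fast guard; names are illustrative assumptions.
class OomGuard {

    /**
     * Runs the task. If it throws OutOfMemoryError, passes the error to
     * onOom and returns false; otherwise returns true.
     */
    static boolean runGuarded(Runnable task, Consumer<OutOfMemoryError> onOom) {
        try {
            task.run();
            return true;
        } catch (OutOfMemoryError e) {
            onOom.accept(e);
            return false;
        }
    }

    public static void main(String[] args) {
        // The error is simulated here; the hook only logs instead of halting.
        boolean ok = runGuarded(
                () -> { throw new OutOfMemoryError("unable to create new native thread"); },
                e -> System.err.println("would terminate the server process: " + e.getMessage()));
        System.out.println("task completed normally: " + ok);
    }
}
```

Passing the reaction in as a callback keeps the sketch testable; whether halting the JVM is acceptable depends on how the host controller is expected to restart the managed server.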

            People

              Chao Wang (chaowan@redhat.com)
              Brad Maxwell (rhn-support-bmaxwell)
              Pavel Jelinek