  1. JBoss Enterprise Application Platform
  2. JBEAP-8214

Artemis doesn't handle JDBC network problems

      See the instructions on how to set up your environment before running the test.

      Run the following commands:

      git clone git@gitlab.mw.lab.eng.bos.redhat.com:jbossqe-eap/messaging-cloud-testsuite.git
      cd messaging-cloud-testsuite
      mvn clean install -Deap=7x -Deap.version=7.1.0.DR17 -DfailIfNoTests=false -DstartNodesDelay=0 -Dtest=JDBCNetworkFailureTestCase#disconnecNetworkAfterCommit | tee log
      

      The server logs and the configuration can be found in the directory messaging-cloud-testsuite/eap7-tests/target/org.jboss.qa.messaging.tests.eap7.JDBCNetworkFailureTestCase.disconnecNetworkWithMDB

    • AMQ Sprint 1

      If the network between Artemis and the DB goes down, Artemis should behave the same way as when journal storage is used and the underlying network file system is disconnected: it should throw a critical IO error and stop itself.

      Currently, if the network is down, JDBC calls hang until the OS TCP timeout expires (typically 10 minutes). This contradicts the fail-fast pattern.
      This behavior can be changed by setting the networkTimeout [1] property to a non-zero value. I think this timeout should be configurable, and its default value should be less than 30 seconds, which is the default timeout for clients' blocking operations.
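
A minimal sketch of how such a configurable timeout could be applied to a JDBC connection, using the standard java.sql.Connection#setNetworkTimeout call from [1]. The class name, the helper names, and the 20-second default are assumptions made for illustration; they are not taken from the Artemis code base. The only constraint taken from the text is that the default stays below the 30-second client blocking timeout.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.Executor;

// Hypothetical helper; not part of Artemis.
final class NetworkTimeoutSketch {

    // Assumed default: 20 s, chosen only because it is below the
    // 30 s client blocking timeout mentioned in the description.
    static final int DEFAULT_NETWORK_TIMEOUT_MS = 20_000;

    private NetworkTimeoutSketch() {
    }

    // Returns the configured value, or the assumed default when nothing is configured.
    static int resolveTimeoutMillis(Integer configuredMillis) {
        return configuredMillis == null ? DEFAULT_NETWORK_TIMEOUT_MS : configuredMillis;
    }

    // Applies the resolved timeout to the connection. The executor is handed
    // to the JDBC driver, which uses it to carry out the timeout handling.
    static void applyNetworkTimeout(Connection connection, Executor executor, Integer configuredMillis)
            throws SQLException {
        connection.setNetworkTimeout(executor, resolveTimeoutMillis(configuredMillis));
    }
}
```

With a non-zero value set this way, a JDBC call that waits on a dead link should fail after the configured time instead of waiting for the OS TCP timeout. The exact exception raised depends on the JDBC driver.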

      If the JDBC connection is closed for any reason (expiry of the TCP timeout or of networkTimeout), Artemis should throw a critical IO error and stop itself.
      Currently, even if the JDBC connection is closed, Artemis keeps trying to execute DB operations on it, which causes exceptions to be thrown. Artemis cannot recover from this state and must be restarted.
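
The requested behavior could be sketched roughly as follows: a small guard wraps each DB operation, reports the first connection-level failure to a critical-error hook exactly once, and rejects all later operations instead of retrying on the dead connection. Everything here is hypothetical: FailFastJdbcGuard, CriticalErrorHandler and SqlOperation are illustrative names, not Artemis APIs, and the rule for deciding what counts as a connection failure (the standard connection-related SQLException subclasses, or an SQLState in class "08") is an assumption that a real driver may not follow.

```java
import java.sql.SQLException;
import java.sql.SQLNonTransientConnectionException;
import java.sql.SQLRecoverableException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical guard; not part of Artemis.
final class FailFastJdbcGuard {

    // Stand-in for the broker's critical IO error hook (assumed shape).
    interface CriticalErrorHandler {
        void onCriticalError(Throwable cause, String message);
    }

    // A DB operation that may fail with SQLException.
    interface SqlOperation<T> {
        T run() throws SQLException;
    }

    private final CriticalErrorHandler handler;
    private final AtomicBoolean failed = new AtomicBoolean(false);

    FailFastJdbcGuard(CriticalErrorHandler handler) {
        this.handler = handler;
    }

    // Assumed classification of connection-level failures.
    static boolean isConnectionFailure(SQLException e) {
        String state = e.getSQLState();
        return e instanceof SQLNonTransientConnectionException
                || e instanceof SQLRecoverableException
                || (state != null && state.startsWith("08"));
    }

    <T> T execute(SqlOperation<T> operation) throws SQLException {
        if (failed.get()) {
            // Fail fast: never touch the connection again once it is marked failed.
            throw new SQLException("JDBC storage is marked as failed");
        }
        try {
            return operation.run();
        } catch (SQLException e) {
            // Report the first connection failure exactly once, then rethrow.
            if (isConnectionFailure(e) && failed.compareAndSet(false, true)) {
                handler.onCriticalError(e, "Critical IO error on JDBC connection");
            }
            throw e;
        }
    }

    boolean isFailed() {
        return failed.get();
    }
}
```

In this sketch the handler would be the place to stop the server, so that clients lose their connections and can fail over to another instance rather than receiving an exception for every operation.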

      Customer impact: If the network between Artemis and the DB goes down, no error appears in the server log for 10 minutes. During this time clients are blocked without any explanatory exception. This contradicts the fail-fast pattern and makes it difficult to find out what is wrong.

      Once the JDBC connection is closed after 10 minutes, clients remain connected to Artemis but get an exception for every operation. Since their connections are still active, they do not reconnect to another Artemis instance.

      [1] https://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#setNetworkTimeout(java.util.concurrent.Executor,%20int)

              fnigro Francesco Nigro
              eduda_jira Erich Duda (Inactive)