-
Bug
-
Resolution: Won't Do
-
Major
-
None
-
1.2.8.Final
-
None
-
None
I've noticed several tests that run O.K. with Oracle JDK7 and fail with Oracle JDK8. One can reproduce these failures anytime, they are not just random glitches.
Notes on the setup:
- simplecontext-666 is a simple JSP web app, non-distributable
- clusterbench is a distributable webapp
- all the tests use 1 worker node and 1 balancer
- many clients throwing many requests at the simplecontext-666 context
Stop Context Timeout test
First, there is the setup with stop-context-timeout="20":
<subsystem xmlns="urn:jboss:domain:modcluster:1.2"> <mod-cluster-config advertise-socket="modcluster" stop-context-timeout="20" connector="ajp"> <dynamic-load-provider> <load-metric type="busyness"/> </dynamic-load-provider> </mod-cluster-config> </subsystem>
On JDK7, one may observe exactly the desired behavior. On a clean shutdown, message Failed to drain 294 remaining pending requests from default-host:/simplecontext-666 within 20.0 seconds is displayed. See the whole log here: 1-server.log.
[org.jboss.web] (ServerService Thread Pool -- 22) JBAS018224: Unregister web context: /simplecontext-666 [org.jboss.as.server.deployment] (MSC service thread 1-8) JBAS015877: Stopped deployment simplecontext-666.war (runtime-name: simplecontext-666.war) in 284ms [org.jboss.modcluster] (ServerService Thread Pool -- 9) MODCLUSTER000024: All active sessions drained from default-host:/simplecontext-666 in 20.0 seconds [org.jboss.modcluster] (ServerService Thread Pool -- 9) MODCLUSTER000022: Failed to drain 294 remaining pending requests from default-host:/simplecontext-666 within 20.0 seconds [org.jboss.modcluster] (ServerService Thread Pool -- 9) MODCLUSTER000002: Initiating mod_cluster shutdown [org.apache.coyote.ajp] (MSC service thread 1-8) JBWEB003048: Pausing Coyote AJP/1.3 on ajp-/10.16.88.17:8009 [org.apache.coyote.ajp] (MSC service thread 1-8) JBWEB003051: Stopping Coyote AJP/1.3 on ajp-/10.16.88.17:8009 [org.jboss.as] (MSC service thread 1-11) JBAS015950: JBoss EAP 6.3.0.Alpha2 (AS 7.4.0.Final-redhat-6) stopped in 20035ms
On the other hand, with JDK8, this utterly fails. No waiting for pending requests, no message. See the whole log here: 2-server.log.
[org.jboss.web] (ServerService Thread Pool -- 18) JBAS018224: Unregister web context: /simplecontext-666 [org.apache.tomcat.util] (http-/10.16.88.17:8080-Acceptor) JBWEB003022: Channel processing failed [org.jboss.modcluster] (ServerService Thread Pool -- 16) MODCLUSTER000002: Initiating mod_cluster shutdown [org.apache.coyote.ajp] (MSC service thread 1-13) JBWEB003048: Pausing Coyote AJP/1.3 on ajp-/10.16.88.17:8009 [org.apache.coyote.ajp] (MSC service thread 1-13) JBWEB003051: Stopping Coyote AJP/1.3 on ajp-/10.16.88.17:8009 [org.jboss.as.server.deployment] (MSC service thread 1-10) JBAS015877: Stopped deployment simplecontext-666.war (runtime-name: simplecontext-666.war) in 310ms [org.jboss.as] (MSC service thread 1-2) JBAS015950: JBoss EAP 6.3.0.Alpha2 (AS 7.4.0.Final-redhat-6) stopped in 348ms
These are the Apache HTTP Server logs, showing exactly what was the test doing:
Note that thousands of "GET /simplecontext-666/ HTTP/1.1" messages were omitted from these snippets.
JDK7, access_log, (See the complete log here 1-access_log.zip)
[06:47:04] "GET /mcm HTTP/1.1" 200 446 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" [06:47:12] "INFO / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [06:47:12] "CONFIG / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [06:47:12] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [06:47:12] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [06:47:12] "STATUS / HTTP/1.1" 200 60 "-" "ClusterListener/1.0" [06:47:13] "GET /mcm HTTP/1.1" 200 1916 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" [06:47:19] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [06:47:19] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [06:47:20] "STOP-APP / HTTP/1.1" 200 84 "-" "ClusterListener/1.0" [06:47:40] "STOP-APP / HTTP/1.1" 200 86 "-" "ClusterListener/1.0" [06:47:40] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [06:47:40] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [06:47:40] "REMOVE-APP /* HTTP/1.1" 200 - "-" "ClusterListener/1.0"
JDK8, access_log, (See the complete log here 2-access_log.zip)
[06:43:10] "GET /mcm HTTP/1.1" 200 446 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" [06:43:17] "INFO / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [06:43:17] "CONFIG / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [06:43:17] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [06:43:17] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [06:43:17] "STATUS / HTTP/1.1" 200 62 "-" "ClusterListener/1.0" [06:43:19] "GET /mcm HTTP/1.1" 200 1916 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" [06:43:25] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [06:43:25] "STOP-APP / HTTP/1.1" 200 84 "-" "ClusterListener/1.0" [06:43:25] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [06:43:25] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [06:43:25] "REMOVE-APP /* HTTP/1.1" 200 - "-" "ClusterListener/1.0"
In the aforementioned snippets, it is noteworthy that the JDK8 run sent only one STOP-APP message. This is confirmed in error logs:
JDK7, error_log, grep STOP-APP (See the complete log here 1-error_log.zip)
[06:47:20] mod_manager.c(1901): manager_trans STOP-APP (/) [06:47:20] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fclusterbench" [06:47:20] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 0 [06:47:20] mod_manager.c(2666): manager_handler STOP-APP OK [06:47:40] mod_manager.c(1901): manager_trans STOP-APP (/) [06:47:40] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fsimplecontext-666" [06:47:40] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 294 [06:47:40] mod_manager.c(2666): manager_handler STOP-APP OK
JDK8, error_log, grep STOP-APP (See the complete log here 2-error_log.zip)
[06:43:25] mod_manager.c(1901): manager_trans STOP-APP (/) [06:43:25] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fclusterbench" [06:43:25] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 0 [06:43:25] mod_manager.c(2666): manager_handler STOP-APP OK
One could argue that there was a test error and the JDK8 test run actually didn't access simplecontext-666 application. Let me dissolve this argument right away:
grep -c "GET /simplecontext-666/ HTTP/1.1\" 200" ./targetJDK8/httpd/logs/access_log 27250 grep -c "GET /simplecontext-666/ HTTP/1.1\" 200" ./targetJDK7/httpd/logs/access_log 27927
Default settings, the same problem...
Trial with the default settings reveals the same problem with some additional findings in the Apache's error log.
<subsystem xmlns="urn:jboss:domain:modcluster:1.2"> <mod-cluster-config advertise-socket="modcluster" connector="ajp"> <dynamic-load-provider> <load-metric type="busyness"/> </dynamic-load-provider> </mod-cluster-config> </subsystem>
On JDK7, one may observe exactly the desired behavior, this time with the default 10 seconds timeout. See the whole log here: 3-server.log.
[org.jboss.web] (ServerService Thread Pool -- 3) JBAS018224: Unregister web context: /simplecontext-666 [org.jboss.as.server.deployment] (MSC service thread 1-6) JBAS015877: Stopped deployment simplecontext-666.war (runtime-name: simplecontext-666.war) in 263ms [org.jboss.modcluster] (ServerService Thread Pool -- 7) MODCLUSTER000024: All active sessions drained from default-host:/simplecontext-666 in 10.0 seconds [org.jboss.modcluster] (ServerService Thread Pool -- 7) MODCLUSTER000022: Failed to drain 340 remaining pending requests from default-host:/simplecontext-666 within 10.0 seconds [org.jboss.modcluster] (ServerService Thread Pool -- 7) MODCLUSTER000002: Initiating mod_cluster shutdown [org.apache.coyote.ajp] (MSC service thread 1-13) JBWEB003048: Pausing Coyote AJP/1.3 on ajp-/10.16.88.17:8009 [org.apache.coyote.ajp] (MSC service thread 1-13) JBWEB003051: Stopping Coyote AJP/1.3 on ajp-/10.16.88.17:8009 [org.jboss.as] (MSC service thread 1-9) JBAS015950: JBoss EAP 6.3.0.Alpha2 (AS 7.4.0.Final-redhat-6) stopped in 10082ms
JDK8 fails. No waiting for pending requests, no message. This time even with ISE . See the whole log here: 4-server.log.
[org.jboss.web] (ServerService Thread Pool -- 16) JBAS018224: Unregister web context: /simplecontext-666 [org.jboss.modcluster] (ServerService Thread Pool -- 19) MODCLUSTER000021: All pending requests drained from default-host:/clusterbench in 0.0 seconds [org.jboss.as.txn] (ajp-/10.16.88.17:8009-21) JBAS010151: Unable to get transaction state: java.lang.IllegalStateException at org.jboss.msc.value.InjectedValue.getValue(InjectedValue.java:47) at org.jboss.as.txn.deployment.TransactionRollbackSetupAction.checkTransactionStatus(TransactionRollbackSetupAction.java:112) at org.jboss.as.txn.deployment.TransactionRollbackSetupAction.teardown(TransactionRollbackSetupAction.java:66) at org.jboss.as.web.ThreadSetupBindingListener.unbind(ThreadSetupBindingListener.java:61) [jboss-as-web-7.4.0.Final-redhat-6.jar:7.4.0.Final-redhat-6] at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:184) [jbossweb-7.4.0.Final.jar:7.4.0.Final] at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:97) [jbossweb-7.4.0.Final.jar:7.4.0.Final] at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:102) [jbossweb-7.4.0.Final.jar:7.4.0.Final] at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:340) [jbossweb-7.4.0.Final.jar:7.4.0.Final] at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:490) [jbossweb-7.4.0.Final.jar:7.4.0.Final] at org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(AjpProtocol.java:420) [jbossweb-7.4.0.Final.jar:7.4.0.Final] at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:926) [jbossweb-7.4.0.Final.jar:7.4.0.Final] at java.lang.Thread.run(Thread.java:744) [rt.jar:1.8.0] +++ [org.jboss.as.server.deployment] (MSC service thread 1-6) JBAS015877: Stopped deployment simplecontext-666.war (runtime-name: simplecontext-666.war) in 301ms [org.jboss.as] (MSC service thread 1-15) JBAS015950: JBoss EAP 6.3.0.Alpha2 (AS 7.4.0.Final-redhat-6) stopped in 313ms
In the access logs, it is noteworthy that both clusterbench and simplecontext-666 sent ENABLE-APP, but there was only one STOP-APP message. Furthermore, who and why used OPTIONS method?
JDK7, access_log, (See the complete log here 3-access_log.zip)
Note that thousands of "GET /simplecontext-666/ HTTP/1.1" messages were omitted from these snippets.
[07:31:49] "GET /mcm HTTP/1.1" 200 446 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" [07:31:57] "INFO / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:31:57] "CONFIG / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:31:57] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:31:57] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:31:57] "STATUS / HTTP/1.1" 200 63 "-" "ClusterListener/1.0" [07:31:58] "GET /mcm HTTP/1.1" 200 1916 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" [07:32:05] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:32:14] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.26 (Red Hat Enterprise Web Server) (internal dummy connection)" [07:32:15] "STOP-APP / HTTP/1.1" 200 86 "-" "ClusterListener/1.0" [07:32:15] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:32:15] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:32:15] "REMOVE-APP /* HTTP/1.1" 200 - "-" "ClusterListener/1.0"
JDK8, access_log, (See the complete log here 4-access_log.zip)
[07:29:25] "GET /mcm HTTP/1.1" 200 446 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" [07:29:32] "INFO / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:29:32] "CONFIG / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:29:32] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:29:32] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:29:32] "STATUS / HTTP/1.1" 200 63 "-" "ClusterListener/1.0" [07:29:34] "GET /mcm HTTP/1.1" 200 1916 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" [07:29:40] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:29:40] "STOP-APP / HTTP/1.1" 200 84 "-" "ClusterListener/1.0" [07:29:40] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:29:40] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:29:40] "REMOVE-APP /* HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:29:40] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.26 (Red Hat Enterprise Web Server) (internal dummy connection)"
Error log shows what context was the STOP-APP message linked to:
JDK7, error_log, grep STOP-APP (See the complete log here 3-error_log.zip)
[07:32:15] mod_manager.c(1901): manager_trans STOP-APP (/) [07:32:15] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fsimplecontext-666" [07:32:15] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 340 [07:32:15] mod_manager.c(2666): manager_handler STOP-APP OK
JDK8, error_log, grep STOP-APP (See the complete log here 4-error_log.zip)
[07:29:40] mod_manager.c(1901): manager_trans STOP-APP (/) [07:29:40] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fclusterbench" [07:29:40] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 0 [07:29:40] mod_manager.c(2666): manager_handler STOP-APP OK
Default settings, again
I had considered the JBAS010151: Unable to get transaction state: java.lang.IllegalStateException being probably an unrelated glitch, so I tried the default test run with JDK8 again. Test failed, yet this time without any exception. Nothing has changed since the last run as far as environment and configuration are concerned.
JDK8 fails. No waiting for pending requests on /simplecontext-666. See the whole log here: 5-server.log.
[org.jboss.web] (ServerService Thread Pool -- 16) JBAS018224: Unregister web context: /clusterbench [org.apache.catalina.core] (MSC service thread 1-6) JBWEB001079: Container org.apache.catalina.core.ContainerBase.[jboss.web].[default-host].[/] has not been started [org.apache.coyote.http11.Http11NioProtocol] (MSC service thread 1-14) JBWEB003075: Coyote HTTP/1.1 pausing on: http-/10.16.88.17:8080 [org.apache.coyote.http11.Http11NioProtocol] (MSC service thread 1-14) JBWEB003077: Coyote HTTP/1.1 stopping on : http-/10.16.88.17:8080 [org.jboss.web] (ServerService Thread Pool -- 3) JBAS018224: Unregister web context: /simplecontext-666 [org.jboss.as.connector.subsystems.datasources] (MSC service thread 1-15) JBAS010409: Unbound data source [java:jboss/datasources/ExampleDS] [org.jboss.modcluster] (ServerService Thread Pool -- 16) MODCLUSTER000021: All pending requests drained from default-host:/clusterbench in 0.1 seconds [org.jboss.modcluster] (ServerService Thread Pool -- 14) MODCLUSTER000021: All pending requests drained from default-host:/clusterbench in 0.1 seconds [org.jboss.modcluster] (ServerService Thread Pool -- 14) MODCLUSTER000002: Initiating mod_cluster shutdown [org.apache.coyote.ajp] (MSC service thread 1-1) JBWEB003048: Pausing Coyote AJP/1.3 on ajp-/10.16.88.17:8009 [org.apache.coyote.ajp] (MSC service thread 1-1) JBWEB003051: Stopping Coyote AJP/1.3 on ajp-/10.16.88.17:8009 [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 9) JBAS010282: Stopped repl cache from web container [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 16) JBAS010282: Stopped default-host/clusterbench cache from web container [org.jboss.as.server.deployment] (MSC service thread 1-13) JBAS015877: Stopped deployment simplecontext-666.war (runtime-name: simplecontext-666.war) in 306ms [org.jboss.as.server.deployment] (MSC service thread 1-5) JBAS015877: Stopped deployment clusterbench.war (runtime-name: clusterbench.war) in 310ms [org.jboss.as] (MSC service thread 1-16) JBAS015950: JBoss EAP 6.3.0.Alpha2 (AS 7.4.0.Final-redhat-6) stopped in 333ms
JDK8, access_log, (See the complete log here 5-access_log.zip)
[07:45:25] "GET /mcm HTTP/1.1" 200 446 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" [07:45:33] "INFO / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:45:33] "CONFIG / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:45:33] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:45:33] "ENABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:45:33] "STATUS / HTTP/1.1" 200 62 "-" "ClusterListener/1.0" [07:45:34] "GET /mcm HTTP/1.1" 200 1916 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)" [07:45:39] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.26 (Red Hat Enterprise Web Server) (internal dummy connection)" [07:45:40] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:45:40] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:45:40] "STOP-APP / HTTP/1.1" 200 84 "-" "ClusterListener/1.0" [07:45:40] "STOP-APP / HTTP/1.1" 200 84 "-" "ClusterListener/1.0" [07:45:40] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:45:40] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:45:40] "REMOVE-APP /* HTTP/1.1" 200 - "-" "ClusterListener/1.0" [07:45:40] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.26 (Red Hat Enterprise Web Server) (internal dummy connection)" [07:45:41] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.26 (Red Hat Enterprise Web Server) (internal dummy connection)"
This time, there were two STOP-APP messages all right, but instead one of them being for clusterbench and the other one for simplecontext-666, they were both for clusterbench. I find this behavior being a rather disturbing one.
JDK8, error_log, grep STOP-APP (See the complete log here 5-error_log.zip)
[07:45:40] mod_manager.c(1901): manager_trans STOP-APP (/) [07:45:40] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fclusterbench" [07:45:40] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 0 [07:45:40] mod_manager.c(2666): manager_handler STOP-APP OK [07:45:40] mod_manager.c(1901): manager_trans STOP-APP (/) [07:45:40] mod_manager.c(2617): manager_handler STOP-APP (/) processing: "JVMRoute=jboss-eap-6.3&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2Fclusterbench" [07:45:40] mod_manager.c(1429): process_appl_cmd: STOP-APP nbrequests 0 [07:45:40] mod_manager.c(2666): manager_handler STOP-APP OK
Again, there were plenty of requests sent to simplecontext-666/:
grep -c "GET /simplecontext-666/ HTTP/1.1\" 200" ./targetJDK8/httpd/logs/access_log 27587
WDYT?
- is related to
-
WFLY-3942 Race condition with clean shutdown and mod_cluster session draining
- Closed