- Type: Bug
- Resolution: Done
- Priority: Blocker
- Affects Version: 7.3.0.CD18
- Fix Version: None
This issue concerns replicated-cache sampling errors observed in fail-over tests.
EAP is started in clustered mode using a replicated cache for replicating HTTP session data across cluster nodes; all 4 nodes in the cluster are initialized with the following CLI script:

embed-server --server-config=standalone-ha.xml
/subsystem=jgroups/channel=ee:write-attribute(name=stack,value=tcp)
/subsystem=infinispan/cache-container=web/replicated-cache=testRepl:add()
/subsystem=infinispan/cache-container=web/replicated-cache=testRepl/component=locking:write-attribute(name=isolation, value=REPEATABLE_READ)
/subsystem=infinispan/cache-container=web/replicated-cache=testRepl/component=transaction:write-attribute(name=mode, value=BATCH)
/subsystem=infinispan/cache-container=web/replicated-cache=testRepl/store=file:add()
/subsystem=infinispan/cache-container=web:write-attribute(name=default-cache, value=testRepl)
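For reference, since the script uses embed-server, it is applied offline while the server is stopped; the file name configure-testRepl.cli below is just an example:

$EAP_HOME/bin/jboss-cli.sh --file=configure-testRepl.cli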
The tests are run with jboss-eap-7.3.0.CD18-CR1.zip.
The same tests run with jboss-eap-7.2.5.CP-CR1.zip do not show any problem, hence this looks like a regression.
As usual, we check that the serial value stored in the replicated cache is incremented on every call: when this is not the case, we count a sampling error.
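As an illustration only, the check amounts to something like the following client loop. This is a minimal sketch, not the actual test harness: the endpoint URL, request count, and plain-number response body are all assumptions.

import java.net.CookieManager;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SerialCheckClient {
    public static void main(String[] args) throws Exception {
        // Keep the JSESSIONID cookie so all requests belong to one HTTP session.
        HttpClient client = HttpClient.newBuilder()
                .cookieHandler(new CookieManager())
                .build();
        // Hypothetical endpoint that returns the session's serial value as plain text.
        URI uri = URI.create("http://localhost:8000/clusterbench/session");
        int requests = 1000;
        long previous = -1;
        long samplingErrors = 0;
        for (int i = 0; i < requests; i++) {
            HttpResponse<String> response = client.send(
                    HttpRequest.newBuilder(uri).GET().build(),
                    HttpResponse.BodyHandlers.ofString());
            long serial = Long.parseLong(response.body().trim());
            // A sampling error: the serial did not advance by exactly one.
            if (previous >= 0 && serial != previous + 1) {
                samplingErrors++;
                System.out.printf("sampling error at request %d: expected %d, got %d%n",
                        i, previous + 1, serial);
            }
            previous = serial;
        }
        // The first request has no predecessor, so requests - 1 comparisons were made.
        System.out.printf("fail rate: %.2f%%%n",
                100.0 * samplingErrors / (requests - 1));
    }
}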
Here are the runs that exhibit this issue:
- 22.23% Fail Rate with EAP-7.3 eap-7.x-clustering-http-session-shutdown-repl#27
- 0% Fail Rate with EAP-7.2 eap-7.x-clustering-http-session-shutdown-repl#28
We also repeated the tests with a slightly different MOD_JK configuration to make sure it can be reproduced:
- 22.45% Fail rate with EAP-7.3 eap-7.x-clustering-http-session-shutdown-repl#29
It's worth mentioning that the same tests performed with HAProxy also exhibit an increase in fail rate, though not as large as with MOD_JK (which probably makes the issue more evident):
- 0.73% Fail Rate with EAP-7.3 eap-7.x-clustering-http-session-shutdown-repl-haproxy#17
- 0% Fail Rate with EAP-7.2 eap-7.x-clustering-http-session-shutdown-repl-haproxy#16
The MOD_JK workers.properties looks like the following:
worker.list=loadbalancer,status

worker.node1.port=8009
worker.node1.host=10.16.176.60
worker.node1.type=ajp13
worker.node1.ping_mode=A
worker.node1.lbfactor=1
worker.node1.retries=2
worker.node1.fail_on_status=404,503

worker.node2.port=8009
worker.node2.host=10.16.176.62
worker.node2.type=ajp13
worker.node2.ping_mode=A
worker.node2.lbfactor=1
worker.node2.retries=2
worker.node2.fail_on_status=404,503

worker.node3.port=8009
worker.node3.host=10.16.176.56
worker.node3.type=ajp13
worker.node3.ping_mode=A
worker.node3.lbfactor=1
worker.node3.retries=2
worker.node3.fail_on_status=404,503

worker.node4.port=8009
worker.node4.host=10.16.176.58
worker.node4.type=ajp13
worker.node4.ping_mode=A
worker.node4.lbfactor=1
worker.node4.retries=2
worker.node4.fail_on_status=404,503

worker.loadbalancer.type=lb
worker.loadbalancer.balance_workers=node1,node2,node3,node4
worker.loadbalancer.sticky_session=1

worker.status.type=status
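The haproxy.cfg used in the -haproxy jobs is not included here; a minimal sketch of an equivalent sticky-session setup over the same four nodes (assuming they serve HTTP on port 8080 and stickiness is done by prefixing JSESSIONID) would look like:

frontend eap_front
    bind *:80
    mode http
    default_backend eap_nodes

backend eap_nodes
    mode http
    balance roundrobin
    cookie JSESSIONID prefix nocache
    server node1 10.16.176.60:8080 check cookie node1
    server node2 10.16.176.62:8080 check cookie node2
    server node3 10.16.176.56:8080 check cookie node3
    server node4 10.16.176.58:8080 check cookie node4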
Clones: WFLY-12718 Clustering: replicated-cache sampling errors (Closed)